AUDIOVISUAL INFORMATION MANAGEMENT SYSTEM



BACKGROUND OF THE INVENTION 

The present invention relates to a system for
managing audiovisual information, and in particular to a
system for audiovisual information browsing, filtering,
searching, archiving, and personalization.

Video cassette recorders (VCRs) may record
video programs in response to pressing a record button or
may be programmed to record video programs based on the
time of day. However, the viewer must program the VCR
based on information from a television guide to identify
relevant programs to record. After recording, the viewer
scans through the entire video tape to select relevant
portions of the program for viewing using the
functionality provided by the VCR, such as fast forward
and fast reverse. Unfortunately, the searching and
viewing is based on a linear search, which may require
significant time to locate the desired portions of the
program(s) and fast forward to the desired portion of the
tape. In addition, it is time consuming to program the
VCR in light of the television guide to record desired
programs. Also, unless the viewer recognizes the
programs from the television guide as desirable, it is
unlikely that the viewer will select such programs to be
recorded.

RePlayTV and TiVo have developed hard disk
based systems that receive, record, and play television
broadcasts in a manner similar to a VCR. The systems may
be programmed with the viewer's viewing preferences. The
systems use a telephone line interface to receive
scheduling information similar to that available from a
television guide. Based upon the system programming and
the scheduling information, the system automatically
records programs that may be of potential interest to the
viewer. Unfortunately, viewing the recorded programs
occurs in a linear manner and may require substantial
time. In addition, each system must be programmed for an
individual's preference, likely in a different manner.

Freeman et al., U.S. Patent No. 5,861,881,
disclose an interactive computer system where subscribers
can receive individualized content.

With all the aforementioned systems, each
individual viewer is required to program the device
according to his particular viewing preferences.
Unfortunately, each different type of device has
different capabilities and limitations which limit the
selections of the viewer. In addition, each device
includes a different interface which the viewer may be
unfamiliar with. Further, if the operator's manual is
inadvertently misplaced it may be difficult for the
viewer to efficiently program the device.

BRIEF SUMMARY OF THE INVENTION 

The present invention overcomes the
aforementioned drawbacks of the prior art by providing a
method of using a system with at least one of audio,
image, and a video comprising a plurality of frames
comprising the steps of providing a usage preferences
description scheme where the usage preference description
scheme includes at least one of a browsing preferences
description scheme, a filtering preferences description
scheme, a search preferences description scheme, and a
device preferences description scheme. The browsing
preferences description scheme relates to a user's
viewing preferences. The filtering and search
preferences description schemes relate to at least one of
(1) content preferences of the at least one of audio,
image, and video, (2) classification preferences of the
at least one of audio, image, and video, (3) keyword
preferences of the at least one of audio, image, and
video, and (4) creation preferences of the at least one
of audio, image, and video. The device preferences
description scheme relates to a user's preferences
regarding presentation characteristics. A usage history
description scheme is provided where the usage history
description scheme includes at least one of a browsing
history description scheme, a filtering history
description scheme, a search history description scheme,
and a device usage history description scheme. The
browsing history description scheme relates to a user's
viewing preferences. The filtering and search history
description schemes relate to at least one of (1) content
usage history of the at least one of audio, image, and
video, (2) classification usage history of the at least
one of audio, image, and video, (3) keyword usage history
of the at least one of audio, image, and video, and (4)
creation usage history of the at least one of audio,
image, and video. The device usage history description
scheme relates to a user's preferences regarding
presentation characteristics. The usage preferences
description scheme and the usage history description
scheme are used to enhance system functionality.

The foregoing and other objectives, features
and advantages of the invention will be more readily
understood upon consideration of the following detailed
description of the invention, taken in conjunction with
the accompanying drawings.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS 

FIG. 1 is an exemplary embodiment of a program,
a system, and a user, with associated description
schemes, of an audiovisual system of the present
invention.

FIG. 2 is an exemplary embodiment of the
audiovisual system, including an analysis module, of
FIG. 1.

FIG. 3 is an exemplary embodiment of the
analysis module of FIG. 2.

FIG. 4 is an illustration of a thumbnail view
(category) for the audiovisual system.

FIG. 5 is an illustration of
(channel) for the audiovisual system.

FIG. 6 is an illustration of
(channel) for the audiovisual system.

FIG. 7 is an illustration of
the audiovisual system.

FIG. 8 is an illustration of
the audiovisual system.

FIG. 9 is an illustration of
the audiovisual system.

FIG. 10 is an illustration of a highlight view
for the audiovisual system.

FIG. 11 is an illustration of an event view for
the audiovisual system.

FIG. 12 is an illustration of a
character/object view for the audiovisual system.

FIG. 13 is an alternative embodiment of a
program description scheme including a syntactic
structure description scheme, a semantic structure
description scheme, a visualization description scheme,
and a meta information description scheme.

FIG. 14 is an exemplary embodiment of the
visualization description scheme of FIG. 13.

FIG. 15 is an exemplary embodiment of the meta
information description scheme of FIG. 13.

FIG. 16 is an exemplary embodiment of a segment
description scheme for the syntactic structure
description scheme of FIG. 13.

FIG. 17 is an exemplary embodiment of a region
description scheme for the syntactic structure
description scheme of FIG. 13.

FIG. 18 is an exemplary embodiment of a
segment/region relation description scheme for the
syntactic structure description scheme of FIG. 13.

FIG. 19 is an exemplary embodiment of an event
description scheme for the semantic structure description
scheme of FIG. 13.

FIG. 20 is an exemplary embodiment of an object
description scheme for the semantic structure description
scheme of FIG. 13.

FIG. 21 is an exemplary embodiment of an
event/object relation graph description scheme for the
syntactic structure description scheme of FIG. 13.

FIG. 22 is an exemplary embodiment of a user
preference description scheme.

FIG. 23 is an exemplary embodiment of the
interrelationship between a usage history description
scheme, an agent, and the usage preference description
scheme of FIG. 22.

FIG. 24 is an exemplary embodiment of the
interrelationship between audio and/or video programs
together with their descriptors, user identification, and
the usage preference description scheme of FIG. 22.

FIG. 25 is an exemplary embodiment of a usage
preference description scheme of FIG. 22.

FIG. 26 is an exemplary embodiment of the
interrelationship between the usage description schemes
and MPEG-7 description schemes.

FIG. 27 is an exemplary embodiment of a usage
history description scheme of FIG. 22.

FIG. 28 is an exemplary system incorporating
the user history description scheme.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

Many households today have many sources of
audio and video information, such as multiple television
sets, multiple VCRs, a home stereo, a home entertainment
center, cable television, satellite television, Internet
broadcasts, world wide web, data services, specialized
Internet services, portable radio devices, and a stereo
in each of their vehicles. For each of these devices, a
different interface is normally used to obtain, select,
record, and play the video and/or audio content. For
example, a VCR permits the selection of the recording
times but the user has to correlate the television guide
with the desired recording times. Another example is the
user selecting a preferred set of preselected radio
stations for his home stereo and also presumably
selecting the same set of preselected stations for each
of the user's vehicles. If another household member
desires a different set of preselected stereo selections,
each audio device would need to be reprogrammed at
substantial inconvenience.

The present inventors came to the realization
that users of visual information and listeners to audio
information, such as for example radio, audio tapes,
video tapes, movies, and news, desire to be entertained
and informed in more than merely one uniform manner. In
other words, the audiovisual information presented to a
particular user should be in a format and include content
suited to their particular viewing preferences. In
addition, the format should be dependent on the content
of the particular audiovisual information. The amount of
information presented to a user or a listener should be
limited to only the amount of detail desired by the
particular user at the particular time. For example, with
the ever increasing demands on the user's time, the user
may desire to watch only 10 minutes of, or merely the
highlights of, a basketball game. In addition, the
present inventors came to the realization that the
necessity of programming multiple audio and visual
devices with their particular viewing preferences is a
burdensome task, especially when presented with
unfamiliar recording devices when traveling. When
traveling, users desire to easily configure unfamiliar
devices, such as audiovisual devices in a hotel room,
with their viewing and listening preferences in an
efficient manner.

The present inventors came to the further
realization that a convenient technique of merely
recording the desired audio and video information is not
sufficient because the presentation of the information
should be in a manner that is time efficient, especially
in light of the limited time frequently available for
the presentation of such information. In addition, the
user should be able to access only that portion of all of
the available information that the user is interested in,
while skipping the remainder of the information.

A user is not capable of watching or otherwise
listening to the vast potential amount of information
available through all, or even a small portion of, the
sources of audio and video information. In addition,
with the increasing information potentially available,
the user is likely not even aware of the potential
content of information that he may be interested in. In
light of the vast amount of audio, image, and video
information, the present inventors came to the
realization that a system that records and presents to
the user audio and video information based upon the
user's prior viewing and listening habits, preferences,
and personal characteristics, generally referred to as
user information, is desirable. In addition, the system
may present such information based on the capabilities of
the system devices. This permits the system to record
desirable information and to customize itself
automatically to the user and/or listener. It is to be
understood that user, viewer, and/or listener terms may
be used interchangeably for any type of content.
Also, the user information should be portable between and
usable by different devices so that other devices may
likewise be configured automatically to the particular
user's preferences upon receiving the viewing
information.

In light of the foregoing realizations and
motivations, the present inventors analyzed a typical
audio and video presentation environment to determine the
significant portions of the typical audiovisual
environment. First, referring to FIG. 1, the video,
image, and/or audio information 10 is provided or
otherwise made available to a user and/or a (device)
system. Second, the video, image, and/or audio
information is presented to the user from the system 12
(device), such as a television set or a radio. Third,
the user both interacts with the system (device) 12 to
view the information 10 in a desirable manner and has
preferences that define which audio, image, and/or video
information is obtained in accordance with the user
information 14. After the proper identification of the
different major aspects of an audiovisual system, the
present inventors then realized that information is
needed to describe the informational content of each
portion of the audiovisual system 16.

With three portions of the audiovisual
presentation system 16 identified, the functionality of
each portion is identified together with its
interrelationship to the other portions. To define the
necessary interrelationships, a set of description
schemes containing data describing each portion is
defined. The description schemes include data that is
auxiliary to the programs 10, the system 12, and the user
14, to store a set of information, ranging from human
readable text to encoded data, that can be used in
enabling browsing, filtering, searching, archiving, and
personalization. By providing a separate description
scheme describing the program(s) 10, the user 14, and the
system 12, the three portions (program, user, and system)
may be combined together to provide an interactivity not
previously achievable. In addition, different programs
10, different users 14, and different systems 12 may be
combined together in any combination, while still
maintaining full compatibility and functionality. It is
to be understood that the description scheme may contain
the data itself or include links to the data, as desired.
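
The separation described above can be pictured with a minimal Python sketch. It is an illustration only, not the format used by the invention: the class names, the `Link` helper, and the field names are all hypothetical, chosen to show three independent descriptions whose fields hold either the data itself or a link to it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Link:
    """Hypothetical helper: a reference to data stored elsewhere."""
    uri: str


# A field value is either the data itself or a link to the data.
Value = Union[str, int, List[str], Link]


@dataclass
class ProgramDescription:
    title: str
    fields: Dict[str, Value] = field(default_factory=dict)


@dataclass
class UserDescription:
    user_id: str
    fields: Dict[str, Value] = field(default_factory=dict)


@dataclass
class SystemDescription:
    device: str
    fields: Dict[str, Value] = field(default_factory=dict)


def combine(program, user, system):
    """Pair any program, user, and system; none refers to the others."""
    return {"program": program.title, "user": user.user_id,
            "system": system.device}
```

Because no class holds a reference to another, any program, user, and system description can be paired freely, which mirrors the "any combination" property stated above.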

A program description scheme 18 related to the
video, still image, and/or audio information 10
preferably includes two sets of information, namely,
program views and program profiles. The program views
define logical structures of the frames of a video that
define how the video frames are potentially to be viewed
suitable for efficient browsing. For example, the program
views may contain a set of fields that contain data for
the identification of key frames, segment definitions
between shots, highlight definitions, video summary
definitions, different lengths of highlights, thumbnail
set of frames, individual shots or scenes, representative
frame of the video, grouping of different events, and a
close-up view. The program view descriptions may contain
thumbnail, slide, key frame, highlights, and close-up
views so that users can filter and search not only at the
program level but also within a particular program. The
description scheme also enables users to access
information in varying detail amounts by supporting, for
example, a key frame view as a part of a program view
providing multiple levels of summary ranging from coarse
to fine. The program profiles define distinctive
characteristics of the content of the program, such as
actors, stars, rating, director, release date, time
stamps, keyword identification, trigger profile, still
profile, event profile, character profile, object
profile, color profile, texture profile, shape profile,
motion profile, and categories. The program profiles are
especially suitable to facilitate filtering and searching
of the audio and video information. The description
scheme enables users to discover
interesting programs that they may be unaware of by
providing a user description scheme. The user
description scheme provides information to a software
agent that in turn performs a search and filtering on
behalf of the user by possibly using the system
description scheme and the program description scheme
information. It is to be understood that in one of the
embodiments of the invention merely the program
description scheme is included.

Program views contained in the program
description scheme are a feature that supports a
functionality such as close-up view. In the close-up
view, a certain image object, e.g., a famous basketball
player such as Michael Jordan, can be viewed up close by
playing back a close-up sequence that is separate from
the original program. An alternative view can be
incorporated in a straightforward manner. Character
profile on the other hand may contain spatio-temporal
position and size of a rectangular region around the
character of interest. This region can be enlarged by
the presentation engine, or the presentation engine may
darken outside the region to focus the user's attention
to the characters spanning a certain number of frames.
Information within the program description scheme may
contain data about the initial size or location of the
region, movement of the region from one frame to another,
and duration in terms of the number of frames featuring
the region. The character profile also provides
provision for including text annotation and audio
annotation about the character as well as web page
information, and any other suitable information. Such
character profiles may include the audio annotation which
is separate from and in addition to the associated audio
track of the video.
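
As a rough illustration of the region data described above, the following Python sketch stores an initial rectangle, a per-frame movement, and a duration in frames. The class name `RegionTrack`, its fields, and the assumption of constant per-frame movement are hypothetical simplifications; the description does not prescribe how movement is encoded.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RegionTrack:
    """Rectangular region around a character (hypothetical layout)."""
    start_frame: int      # first frame featuring the region
    num_frames: int       # duration in terms of number of frames
    x: float              # initial location of the region
    y: float
    width: float          # initial size of the region
    height: float
    dx: float = 0.0       # assumed constant movement per frame
    dy: float = 0.0

    def box_at(self, frame: int) -> Optional[Tuple[float, float, float, float]]:
        """Return (x, y, width, height) at a frame, or None outside the track."""
        offset = frame - self.start_frame
        if not 0 <= offset < self.num_frames:
            return None
        return (self.x + offset * self.dx,
                self.y + offset * self.dy,
                self.width,
                self.height)
```

A presentation engine could call `box_at` for each displayed frame and enlarge the returned rectangle or darken everything outside it.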

The program description scheme may likewise
contain similar information regarding audio (such as
radio broadcasts) and images (such as analog or digital
photographs or a frame of a video).

The user description scheme 20 preferably
includes the user's personal preferences, and information
regarding the user's viewing history such as for example
browsing history, filtering history, searching history,
and device setting history. The user's personal
preferences include information regarding particular
programs and categorizations of programs that the user
prefers to view. The user description scheme may also
include personal information about the particular user,
such as demographic and geographic information, e.g., zip
code and age. The explicit definition of the particular
programs or attributes related thereto permits the system
16 to select those programs from the information
contained within the available program description
schemes 18 that may be of interest to the user.
Frequently, the user does not desire to learn to program
the device nor desire to explicitly program the device.
In addition, the user description scheme 20 may not be
sufficiently robust to include explicit definitions
describing all desirable programs for a particular user.
In such a case, the capability of the user description
scheme 20 to adapt to the viewing habits of the user to
accommodate different viewing characteristics not
explicitly provided for or otherwise difficult to
describe is useful. In such a case, the user description
scheme 20 may be augmented or any technique can be used
to compare the information contained in the user
description scheme 20 to the available information
contained in the program description scheme 18 to make
selections. The user description scheme provides a
technique for holding user preferences ranging from
program categories to program views, as well as usage
history. User description scheme information is
persistent but can be updated by the user or by an
intelligent software agent on behalf of the user at any
arbitrary time. It may also be disabled by the user, at
any time, if the user decides to do so. In addition, the
user description scheme is modular and portable so that
users can carry or port it from one device to another,
such as with a handheld electronic device or smart card
or transported over a network connecting multiple
devices. When the user description scheme is standardized
among different manufacturers or products, user
preferences become portable. For example, a user can
personalize the television receiver in a hotel room,
permitting users to access information they prefer at any
time and anywhere. In a sense, the user description
scheme is persistent and timeless based. In addition,
selected information within the program description
scheme may be encrypted since at least part of the
information may be deemed to be private (e.g.,
demographics). A user description scheme may be
associated with an audiovisual program broadcast and
compared with a particular user's description scheme of
the receiver to readily determine whether or not the
program's intended audience profile matches that of the
user. It is to be understood that in one of the
embodiments of the invention merely the user description
scheme is included.

The system description scheme 22 preferably
manages the individual programs and other data. The
management may include maintaining lists of programs,
categories, channels, users, videos, audio, and images.
The management may include the capabilities of a device
for providing the audio, video, and/or images. Such
capabilities may include, for example, screen size,
stereo, AC3, DTS, color, black/white, etc. The
management may also include relationships between any one
or more of the user, the audio, and the images in
relation to one or more of a program description
scheme(s) and a user description scheme(s). In a similar
manner the management may include relationships between
one or more of the program description scheme(s) and user
description scheme(s). It is to be understood that in
one of the embodiments of the invention merely the system
description scheme is included.

The descriptors of the program description
scheme and the user description scheme should overlap, at
least partially, so that potential desirability of the
program can be determined by comparing descriptors
representative of the same information. For example, the
program and user description schemes may include the same
set of categories and actors. The program description
scheme has no knowledge of the user description scheme,
and vice versa, so that each description scheme is not
dependent on the other for its existence. It is not
necessary for the description schemes to be fully
populated. It is also beneficial not to include the
program description scheme with the user description
scheme because there will likely be thousands of programs
with associated description schemes which if combined
with the user description scheme would result in an
unnecessarily large user description scheme. It is
desirable to keep the user description scheme small
so that it is more readily portable. Accordingly, a
system including only the program description scheme and
the user description scheme would be beneficial.
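
One possible way to compare overlapping descriptors is sketched below in Python. The scoring rule (the average fraction of the user's preferred values that the program also lists, taken over the descriptors both schemes share) is an assumption made for illustration; the description above only requires that descriptors representing the same information be compared. The function name and dictionary layout are likewise hypothetical.

```python
def desirability(program: dict, user: dict) -> float:
    """Score a program against a user using only shared descriptors.

    Both arguments map a descriptor name (e.g. "categories") to a list
    of values. Descriptors present in only one scheme are ignored, so
    neither scheme needs to be fully populated.
    """
    shared = program.keys() & user.keys()
    if not shared:
        return 0.0
    total = 0.0
    for name in shared:
        wanted = set(user[name])
        offered = set(program[name])
        if wanted:
            # Fraction of the user's preferred values the program matches.
            total += len(wanted & offered) / len(wanted)
    return total / len(shared)
```

For example, a program listing categories "sports" and "basketball" and actor "Michael Jordan" scores 0.75 against a user who prefers categories "sports" and "news" and the same actor.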

The user description scheme and the system
description scheme should include at least partially
overlapping fields. With overlapping fields the system
can capture the desired information, which would
otherwise not be recognized as desirable. The system
description scheme preferably includes a list of users
and available programs. Based on the master list of
available programs, and associated program description
scheme, the system can match the desired programs. It is
also beneficial not to include the system description
scheme with the user description scheme because there
will likely be thousands of programs stored in the system
description schemes which if combined with the user
description scheme would result in an unnecessarily large
user description scheme. It is desirable to keep
the user description scheme small so that it is more
readily portable. For example, the user description
scheme may include radio station preselected frequencies
and/or types of stations, while the system description
scheme includes the available radio stations
in particular cities. When traveling to a different city
the user description scheme together with the system
description scheme will permit reprogramming the radio
stations. Accordingly, a system including only the
system description scheme and the user description scheme
would be beneficial.
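
The radio example above can be sketched in a few lines of Python. The function name, the dictionary layout, the placeholder city names, and the frequencies are all hypothetical and purely illustrative; the sketch assumes the user description carries preferred station types and the system description lists, per city, one frequency for each available type.

```python
def retune(preferred_types, stations_by_city, city):
    """Map the user's preferred station types to local frequencies.

    preferred_types comes from the (small, portable) user description;
    stations_by_city comes from the system description.
    Types with no local station are skipped.
    """
    available = stations_by_city.get(city, {})
    return {t: available[t] for t in preferred_types if t in available}


# Illustrative data only; the frequencies are made up.
stations_by_city = {
    "city_a": {"news": 90.9, "jazz": 88.5, "classical": 99.5},
    "city_b": {"news": 101.1, "jazz": 95.5},
}
presets = retune(["news", "jazz", "classical"], stations_by_city, "city_b")
```

The user description stays small because it names only station types, while the per-city station list lives in the system description.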

The program description scheme and the system
description scheme should include at least partially
overlapping fields. With the overlapping fields, the
system description scheme will be capable of storing the
information contained within the program description
scheme, so that the information is properly indexed.
With proper indexing, the system is capable of matching
such information with the user information, if available,
for obtaining and recording suitable programs. If the
program description scheme and the system description
scheme were not overlapping then no information would be
extracted from the programs and stored. System
capabilities specified within the system description
scheme of a particular viewing system can be correlated
with a program description scheme to determine the views
that can be supported by the viewing system. For
instance, if the viewing device is not capable of playing
back video, its system description scheme may describe
its viewing capabilities as limited to keyframe view and
slide view only. The program description scheme of a
particular program and the system description scheme of the
viewing system are utilized to present the appropriate
views to the viewing system. Thus, a server of programs
serves the appropriate views according to a particular
viewing system's capabilities, which may be communicated
over a network or communication channel connecting the
server with the user's viewing device. It is preferred to
maintain the program description scheme separate from the
system description scheme because the content providers
repackage the content and description schemes in
different styles, times, and formats. Preferably, the
program description scheme is associated with the
program, even if displayed at a different time.
Accordingly, a system including only the system
description scheme and the program description scheme
would be beneficial.
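
Correlating system capabilities with program views amounts to keeping the program views that the viewing system can support. The Python sketch below assumes, for illustration only, that both description schemes list views by name; the function name and the view labels are hypothetical.

```python
def views_to_serve(program_views, system_views):
    """Return the program's views that the viewing system supports.

    The order of the program's view list is preserved.
    """
    supported = set(system_views)
    return [view for view in program_views if view in supported]


# A device that cannot play back video advertises only two views.
program_views = ["thumbnail", "slide", "key frame", "highlight", "close-up"]
still_only_device = ["key frame", "slide"]
served = views_to_serve(program_views, still_only_device)
```

A server of programs could run this check on the system description received over the network before deciding which views to send.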

By preferably maintaining the independence of
each of the three description schemes while having fields
that correlate the same information, the programs 10, the
users 14, and the system 12 may be interchanged with one
another while maintaining the functionality of the entire
system 16. Referring to FIG. 2, the audio, visual, or
audiovisual program 38 is received by the system 16.
The program 38 may originate at any suitable source, such
as for example broadcast television, cable television,
satellite television, digital television, Internet
broadcasts, world wide web, digital video discs, still
images, video cameras, laser discs, magnetic media,
computer hard drive, video tape, audio tape, data
services, radio broadcasts, and microwave communications.
The program description stream may originate from any
suitable source, such as for example PSIP/DVB-SI
information in digital television broadcasts, specialized
digital television data services, specialized Internet
services, world wide web, data files, data over the
telephone, and memory, such as computer memory. The
program, user, and/or system description scheme may be
transported over a network (communication channel). For
example, the system description scheme may be transported
to the source to provide the source with views or other
capabilities that the device is capable of using. In
response, the source provides the device with image,
audio, and/or video content customized or otherwise
suitable for the particular device. The system 16 may
include any device(s) suitable to receive any one or more
of such programs 38. An audiovisual program analysis
module 42 performs an analysis of the received programs
38 to extract and provide program related information
(descriptors) to the description scheme (DS) generation
module 44. The program related information may be
extracted from the data stream including the program 38
or obtained from any other source, such as for example
data transferred over a telephone line, data already
transferred to the system 16 in the past, or data from an
associated file. The program related information
preferably includes data defining both the program views
and the program profiles available for the particular
program 38. The analysis module 42 performs an analysis
of the programs 38 using information obtained from (i)
automatic audio-video analysis methods on the basis of
low-level features that are extracted from the
program(s), (ii) event detection techniques, (iii) data
that is available (or extractable) from data sources or
electronic program guides (EPGs, DVB-SI, and PSIP), and
(iv) user information obtained from the user description
scheme 20 to provide data defining the program
description scheme.

The selection of a particular program analysis
technique depends on the amount of readily available data
and the user preferences. For example, if a user prefers
to watch a 5 minute video highlight of a particular
program, such as a basketball game, the analysis module
42 may invoke a knowledge based system 90 (FIG. 3) to
determine the highlights that form the best 5 minute
summary. The knowledge based system 90 may invoke a
commercial filter 92 to remove commercials and a slow
motion detector 54 to assist in creating the video
summary. The analysis module 42 may also invoke other
modules to bring information together (e.g., textual
information) to author particular program views. For
example, if the program 38 is a home video where there is
no further information available then the analysis module
42 may create a key-frame summary by identifying
key-frames of a multi-level summary and passing the
information to be used to generate the program views, and
in particular a key frame view, to the description
scheme. Referring also to FIG. 3, the analysis module 42
may also include other sub-modules, such as for example,
a de-mux/decoder 60, a data and service content analyzer
62, a text processing and text summary generator 64, a
close caption analyzer 66, a title frame generator 68, an
analysis manager 70, an audiovisual analysis and feature
extractor 72, an event detector 74, a key-frame
summarizer 76, and a highlight summarizer 78.

The generation module 44 receives the system
information 46 for the system description scheme. The
system information 46 preferably includes data for the
system description scheme 22 generated by the generation
module 44. The generation module 44 also receives user
information 48 including data for the user description
scheme. The user information 48 preferably includes data
for the user description scheme generated within the
generation module 44. The user input 48 may include, for
example, meta information to be included in the program
and system description scheme. The user description
scheme (or corresponding information) is provided to the
analysis module 42 for selective analysis of the
program(s) 38. For example, the user description scheme
may be suitable for triggering the highlight generation
functionality for a particular program and thus
generating the preferred views and storing associated
data in the program description scheme. The generation
module 44 and the analysis module 42 provide data to a
data storage unit 50. The storage unit 50 may be any
storage device, such as memory or magnetic media.

A search, filtering, and browsing (SFB) module
52 implements the description scheme technique by parsing
and extracting information contained within the
description scheme. The SFB module 52 may perform
filtering, searching, and browsing of the programs 38, on
the basis of the information contained in the description
schemes. An intelligent software agent is preferably
included within the SFB module 52 that gathers and
provides user specific information to the generation
module 44 to be used in authoring and updating the user
description scheme (through the generation module 44).
In this manner, desirable content may be provided to the
user through a display 80. The selections of the desired
program(s) to be retrieved, stored, and/or viewed may be
programmed, at least in part, through a graphical user
interface 82. The graphical user interface may also
include or be connected to a presentation engine for
presenting the information to the user through the
graphical user interface.

The intelligent management and consumption of
audiovisual information using the multi-part description
stream device provides a next-generation device suitable
for the modern era of information overload. The device
responds to changing lifestyles of individuals and
families, and allows everyone to obtain the information
they desire anytime and anywhere they want.

An example of the use of the device may be as
follows. A user comes home from work late Friday evening,
happy that the work week is finally over. The user
desires to catch up with the events of the world and then
watch ABC's 20/20 show later that evening. It is now 9
PM and the 20/20 show will start in an hour at 10 PM.
The user is interested in the sporting events of the
week, and all the news about the Microsoft case with the
Department of Justice. The user description scheme may
include a profile indicating that the particular
user wants to obtain all available information regarding
the Microsoft trial and selected sporting events for
particular teams. In addition, the system description
scheme and program description scheme provide information
regarding the content of the available information that
may selectively be obtained and recorded. The system, in
an autonomous manner, periodically obtains and records
the audiovisual information that may be of interest to
the user during the past week based on the three
description schemes. The device most likely has recorded
more than one hour of audiovisual information so the
information needs to be condensed in some manner. The
user starts interacting with the system with a pointer or
voice commands to indicate a desire to view recorded
sporting programs. On the display, the user is presented
with a list of recorded sporting events including
basketball and soccer. Apparently the user's favorite
football team did not play that week because it was not
recorded. The user is interested in basketball games and
indicates a desire to view games. A set of title frames
is presented on the display that captures an important
moment of each game. The user selects the Chicago Bulls
game and indicates a desire to view a 5 minute highlight
of the game. The system automatically generates
highlights. The highlights may be generated by audio or
video analysis, or the program description scheme
includes data indicating the frames that are presented
for a 5 minute highlight. The system may have also
recorded web-based textual information regarding the
particular Chicago Bulls game which may be selected by
the user for viewing. If desired, the summarized
information may be recorded onto a storage device, such
as a DVD with a label. The stored information may also
include an index code so that it can be located at a
later time. After viewing the sporting events the user
may decide to read the news about the Microsoft trial.
It is now 9:50 PM and the user is done viewing the news.
In fact, the user has selected to delete all the recorded
news items after viewing them. The user then remembers
to do one last thing before 10 PM in the evening. The
next day, the user desires to watch the VHS tape that he
received from his brother that day, containing footage
about his brother's new baby girl and his vacation to

Peru last summer. The user wants to watch the whole
2-hour tape but he is anxious to see what the baby looks
like and also the new stadium built in Lima, which was
not there last time he visited Peru. The user plans to
take a quick look at a visual summary of the tape,
browse, and perhaps watch a few segments for a couple of
minutes, before the user takes his daughter to her piano
lesson at 10 AM the next morning. The user plugs the
tape into his VCR, which is connected to the system, and
invokes the summarization functionality of the system to
scan the tape and prepare a summary. The user can then
view the summary the next morning to quickly discover the
baby's looks, and play back segments between the
key-frames of the summary to catch a glimpse of the crying
baby. The system may also record the tape content onto
the system hard drive (or storage device) so the video
summary can be viewed quickly. It is now 10:10 PM, and
it seems that the user is 10 minutes late for viewing
20/20. Fortunately, the system, based on the three
description schemes, has already been recording 20/20
since 10 PM. Now the user can start watching the
recorded portion of 20/20 as the recording of 20/20
proceeds. The user will be done viewing 20/20 at 11:10
PM.

The average consumer has an ever increasing
number of multimedia devices, such as a home audio
system, a car stereo, several home television sets, web
browsers, etc. The user currently has to customize each
of the devices for optimal viewing and/or listening
preferences. By storing the user preferences on a
removable storage device, such as a smart card, the user
may insert the card including the user preferences into
such media devices for automatic customization. This
results in the desired programs being automatically
recorded on the VCR, and in the radio stations being set
for the car stereo and home audio system. In this manner
the user only has to specify his preferences at most
once, on a single device and subsequently, the
descriptors are automatically uploaded into devices by
the removable storage device. The user description
scheme may also be loaded into other devices using a
wired or wireless network connection, e.g. that of a home
network. Alternatively, the system can store the user
history and create entries in the user description scheme
based on the user's audio and video viewing habits. In this
manner, the user would never need to program the viewing
information to obtain desired information. In a sense,
the user descriptor scheme enables modeling of the user
by providing a central storage for the user's listening,
viewing, browsing preferences, and user's behavior. This
enables devices to be quickly personalized, and enables
other components, such as intelligent agents, to
communicate on the basis of a standardized description
format, and to make smart inferences regarding the user's
preferences.
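
As a rough sketch of how stored usage history might be turned into entries of the user description scheme, the following Python fragment counts the categories a user has watched and keeps the frequent ones. The function name `infer_preferences`, the layout of the history records, and the minimum-count threshold are assumptions made for illustration; the specification does not prescribe a particular inference method.

```python
from collections import Counter

def infer_preferences(history, min_count=2):
    """Return categories watched at least min_count times, most frequent first.

    history is a list of (program_name, category) pairs; this layout is
    hypothetical and only stands in for the stored usage history.
    """
    counts = Counter(category for _, category in history)
    return [cat for cat, n in counts.most_common() if n >= min_count]

history = [
    ("Bulls vs. Knicks", "basketball"),
    ("Evening News", "news"),
    ("Bulls vs. Jazz", "basketball"),
    ("Market Report", "news"),
    ("Bulls vs. Heat", "basketball"),
    ("Cooking Hour", "cooking"),
]
preferences = infer_preferences(history)
```

The resulting list could then be written into the user description scheme without the user having to enter the preferences by hand.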

Many different realizations and applications
can be readily derived from FIGS. 2 and 3 by
appropriately organizing and utilizing their different
parts, or by adding peripherals and extensions as needed.
In its most general form, FIG. 2 depicts an audiovisual
searching, filtering, browsing, and/or recording
appliance that is personalizable. The list of more
specific applications/implementations given below is not
exhaustive but covers a range.

The user description scheme is a major enabler
for personalizable audiovisual appliances. If the
structure (syntax and semantics) of the description
schemes is known amongst multiple appliances, the user
can carry (or otherwise transfer) the information
contained within his user description scheme from one
appliance to another, perhaps via a smart card (where
these appliances support a smart card interface), in order
to personalize them. Personalization can range from
device settings, such as display contrast and volume
control, to settings of television channels, radio
stations, web stations, web sites, geographic
information, and demographic information such as age, zip
code, etc. Appliances that can be personalized may access
content from different sources. They may be connected to
the web, terrestrial or cable broadcast, etc., and they
may also access multiple or different types of single
media such as video, music, etc.

For example, one can personalize the car stereo
using a smart card unplugged from the home system and
plugged into the car stereo system to be able to tune to
favorite stations at certain times. As another example,
one can also personalize television viewing, for example,
by plugging the smart card into a remote control that in
turn will autonomously command the television receiving
system to present the user information about current and
future programs that fit the user's preferences.
Different members of the household can instantly
personalize the viewing experience by inserting their own
smart card into the family remote. In the absence of such
a remote, this same type of personalization can be
achieved by plugging the smart card directly into the
television system. The remote may likewise control audio
systems. In another implementation, the television
receiving system holds user description schemes for
multiple users in local storage and identifies
different users (or groups of users) by using an
appropriate input interface, for example an interface
using user-voice identification technology. It is noted
that in a networked system the user description scheme
may be transported over the network.

The user description scheme is generated by
direct user input, and by using software that watches
the user to determine his/her usage pattern and usage
history. The user description scheme can be updated in a
dynamic fashion by the user or automatically. A well
defined and structured description scheme design allows
different devices to interoperate with each other. A
modular design also provides portability.

The description scheme adds new functionality
to those of the current VCR. An advanced VCR system can
learn from the user via direct input of preferences, or
by watching the usage pattern and history of the user.
The user description scheme holds the user's preferences
and usage history. An intelligent agent can then
consult with the user description scheme and obtain
information that it needs for acting on behalf of the
user. Through the intelligent agent, the system acts on
behalf of the user to discover programs that fit the
taste of the user, alert the user about such programs,
and/or record them autonomously. An agent can also manage
the storage in the system according to the user
description scheme, i.e., prioritizing the deletion of
programs (or alerting the user for transfer to
removable media), or determining their compression factor
(which directly impacts their visual quality) according
to the user's preferences and history.
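
The storage management role of the agent can be illustrated with a minimal Python sketch that orders stored programs for deletion. The record layout (`name`, `category`, `watched`) and the scoring rule are hypothetical; the specification only states that deletion is prioritized according to the user's preferences and history, not how the priority is computed.

```python
def deletion_order(programs, preferred_categories):
    """Order stored programs so the least valuable ones are deleted first.

    Each program is a dict with 'name', 'category', and 'watched' keys
    (a hypothetical record layout). Watched programs outside the user's
    preferred categories come first; unwatched preferred ones come last.
    """
    def keep_score(p):
        score = 0
        if p["category"] in preferred_categories:
            score += 2  # preferred content is worth keeping longer
        if not p["watched"]:
            score += 1  # unwatched content is worth keeping longer
        return score
    return sorted(programs, key=keep_score)

stored = [
    {"name": "20/20", "category": "news", "watched": False},
    {"name": "Old sitcom", "category": "comedy", "watched": True},
    {"name": "Bulls game", "category": "basketball", "watched": True},
]
order = [p["name"] for p in deletion_order(stored, {"news", "basketball"})]
```

An agent could walk this list from the front, deleting programs (or alerting the user to transfer them) until enough space is free.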

The program description scheme and the system
description scheme work in collaboration with the user
description scheme in achieving some tasks. In addition,
the program description scheme and system description
scheme in an advanced VCR or other system will enable the
user to browse, search, and filter audiovisual programs.
Browsing in the system offers capabilities that are well
beyond fast forwarding and rewinding. For instance, the
user can view a thumbnail view of different categories of
programs stored in the system. The user then may choose
frame view, shot view, key frame view, or highlight view,
depending on their availability and user's preference.
These views can be readily invoked using the relevant
information in the program description scheme, especially
in program views. The user at any time can start viewing
the program either in parts, or in its entirety.

In this application, the program description
scheme may be readily available from many services such
as: (i) from broadcast (carried by EPG defined as a part
of ATSC-PSIP (ATSC Program and System Information Protocol)
in USA or DVB-SI (Digital Video Broadcasting - Service
Information) in Europe); (ii) from specialized data
services (in addition to PSIP/DVB-SI); (iii) from
specialized web sites; (iv) from the media storage unit
containing the audiovisual content (e.g., DVD); (v) from
advanced cameras (discussed later), and/or may be
generated (i.e., for programs that are being stored) by
the analysis module 42 or by user input 48.

Contents of digital still and video cameras can
be stored and managed by a system that implements the
description schemes, e.g., a system as shown in FIG. 2.
Advanced cameras can store a program description scheme,
for instance, in addition to the audiovisual content
itself. The program description scheme can be generated
either in part or in its entirety on the camera itself
via an appropriate user input interface (e.g., speech,
visual menu driven, etc.). Users can input to the camera
the program description scheme information, especially
the high-level (or semantic) information that may
otherwise be difficult to automatically extract by the
system. Some camera settings and parameters (e.g., date
and time), as well as quantities computed in the camera
(e.g., color histogram to be included in the color
profile), can also be used in generating the program
description scheme. Once the camera is connected, the
system can browse the camera content, or transfer the
camera content and its description scheme to the local
storage for future use. It is also possible to update or
add information to the description scheme generated in
the camera.

The IEEE 1394 and HAVi standard specifications
enable this type of "audiovisual content" centric
communication among devices. The description scheme
APIs can be used in the context of HAVi to browse and/or
search the contents of a camera or a DVD which also
contain a description scheme associated with their
content, i.e., doing more than merely invoking the PLAY
API to play back and linearly view the media.

The description schemes may be used in
archiving audiovisual programs in a
database. The search engine uses the information
contained in the program description scheme to retrieve
programs on the basis of their content. The program
description scheme can also
be used in navigating through the contents of the
database or the query results. The user description
scheme can be used in prioritizing the results of the
user query during presentation. It is possible of course
to make the program description scheme more comprehensive
depending on the nature of the particular application.
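
The division of labor described above, where the program description scheme drives retrieval and the user description scheme orders the results, might be sketched in Python as follows. The archive records, the field names, and the helper names `search` and `prioritize` are illustrative assumptions rather than part of the described system.

```python
def search(programs, keyword):
    """Return programs whose keyword list contains the query keyword."""
    return [p for p in programs if keyword in p["keywords"]]

def prioritize(results, preferred_categories):
    """Stable sort: results in a preferred category are presented first."""
    return sorted(results, key=lambda p: p["category"] not in preferred_categories)

# Hypothetical archive entries standing in for program description schemes.
archive = [
    {"name": "Tech Weekly", "category": "technology",
     "keywords": ["microsoft", "trial"]},
    {"name": "Evening News", "category": "news",
     "keywords": ["microsoft", "weather"]},
    {"name": "Bulls game", "category": "basketball",
     "keywords": ["chicago", "bulls"]},
]
ranked = [p["name"] for p in prioritize(search(archive, "microsoft"), {"news"})]
```

Because Python's sort is stable, programs within the same preference group keep the order in which the search returned them.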

The description scheme fulfills the user's
desire to have applications that pay attention and are
responsive to their viewing and usage habits,
preferences, and personal demographics. The proposed
user description scheme directly addresses this desire in
its selection of fields and interrelationship to other
description schemes. Because the description schemes are
modular in nature, the user can port his user description
scheme from one device to another in order to
"personalize" the device.

The proposed description schemes can be
incorporated into current products similar to those from
TiVo and Replay TV in order to extend their entertainment
informational value. In particular, the description
scheme will enable audiovisual browsing and searching of
programs and enable filtering within a particular program
by supporting multiple program views such as the
highlight view. In addition, the description scheme will
handle programs coming from sources other than television
broadcasts, which TiVo and Replay TV are not designed
to handle. In addition, by standardization of TiVo and
Replay TV type of devices, other products may be
interconnected to such devices to extend their
capabilities, such as devices supporting an MPEG-7
description. MPEG-7 is the Moving Pictures Experts Group
7, acting to standardize descriptions and description
schemes for audiovisual information. The device may also
be extended to be personalized by multiple users, as
desired.

Because the description scheme is defined, the
intelligent software agents can communicate among
themselves to make intelligent inferences regarding the
user's preferences. In addition, the development and
upgrade of intelligent software agents for browsing and
filtering applications can be simplified based on the
standardized user description scheme.

The description scheme is multi-modal in the
sense that it holds both high level (semantic)
and low level features and/or descriptors. For example,
actor name is a high level descriptor and motion model
parameters are low level descriptors. High level
descriptors are easily readable by humans while low level
descriptors are more easily read by machines and less
understandable by humans. The program description scheme
can be readily harmonized with existing EPG, PSIP, and
DVB-SI information facilitating search and filtering of
broadcast programs. Existing services can be extended in
the future by incorporating additional information using
the compliant description scheme.

For example, one case may include audiovisual
programs that are prerecorded on a medium such as a
digital video disc where the digital video disc also
contains a description scheme that has the same syntax
and semantics as the description scheme that the SFB
module uses. If the SFB module uses a different
description scheme, a transcoder (converter) of the
description scheme may be employed. The user may want to
browse and view the content of the digital video disc.
In this case, the user may not need to invoke the
analysis module to author a program description.
However, the user may want to invoke his or her user
description scheme in filtering, searching and browsing
the digital video disc content. Other sources of program
information may likewise be used in the same manner.

It is to be understood that any of the
techniques described herein with relation to video are
equally applicable to images (such as a still image or a
frame of a video) and audio (such as radio).

An example of an audiovisual interface is shown
in FIGS. 4-12 which is suitable for the preferred
audiovisual description scheme. Referring to FIG. 4,
selecting the thumbnail function as a function of
category provides a display with a set of categories on
the left hand side. Selecting a particular category,
such as news, provides a set of thumbnail views of
different programs that are currently available for
viewing. In addition, the different programs may also
include programs that will be available at a different
time for viewing. The thumbnail views are short video
segments that provide an indication of the content of the
respective actual program to which each corresponds.
Referring to FIG. 5, a thumbnail view of available
programs in terms of channels may be displayed, if
desired. Referring to FIG. 6, a text view of available
programs in terms of channels may be displayed, if
desired. Referring to FIG. 7, a frame view of particular
programs may be displayed, if desired. A representative
frame is displayed in the center of the display with a
set of representative frames of different programs in the
left hand column. The frequency of the number of frames
may be selected, as desired. Also a set of frames is
displayed on the lower portion of the display
representative of different frames during the particular
selected program. Referring to FIG. 8, a shot view of
particular programs may be displayed, as desired. A
representative frame of a shot is displayed in the center
of the display with a set of representative frames of
different programs in the left hand column. Also a set
of shots is displayed on the lower portion of the
display representative of different shots (segments of a
program, typically sequential in nature) during the
particular selected program. Referring to FIG. 9, a key
frame view of particular programs may be displayed, as
desired. A representative frame is displayed in the
center of the display with a set of representative frames
of different programs in the left hand column. Also a
set of key frame views is displayed on the lower portion
of the display representative of different key frame
portions during the particular selected program. The
number of key frames in each key frame view can be
adjusted by selecting the level. Referring to FIG. 10, a
highlight view may likewise be displayed, as desired.
Referring to FIG. 11, an event view may likewise be
displayed, as desired. Referring to FIG. 12, a
character/object view may likewise be displayed, as
desired.

An example of the description schemes is shown
below in XML. The description scheme may be implemented
in any language and include any of the included
descriptions (or more), as desired.

The proposed program description scheme
includes three major sections for describing a video
program. The first section identifies the described
program. The second section defines a number of views
which may be useful in browsing applications. The third
section defines a number of profiles which may be useful
in filtering and search applications. Therefore, the
overall structure of the proposed description scheme is
as follows:

<?xml version="1.0"?>
<!DOCTYPE MPEG-7 SYSTEM "mpeg-7.dtd">
<ProgramIdentity>
    <ProgramID> ... </ProgramID>
    <ProgramName> ... </ProgramName>
    <SourceLocation> ... </SourceLocation>
</ProgramIdentity>
<ProgramViews>
    <ThumbnailView> ... </ThumbnailView>
    <SlideView> ... </SlideView>
    <FrameView> ... </FrameView>
    <ShotView> ... </ShotView>
    <KeyFrameView> ... </KeyFrameView>
    <HighlightView> ... </HighlightView>
    <EventView> ... </EventView>
    <CloseUpView> ... </CloseUpView>
    <AlternateView> ... </AlternateView>
</ProgramViews>
<ProgramProfiles>
    <GeneralProfile> ... </GeneralProfile>
    <CategoryProfile> ... </CategoryProfile>
    <DateTimeProfile> ... </DateTimeProfile>
    <KeywordProfile> ... </KeywordProfile>
    <TriggerProfile> ... </TriggerProfile>
    <StillProfile> ... </StillProfile>
    <EventProfile> ... </EventProfile>
    <CharacterProfile> ... </CharacterProfile>
    <ObjectProfile> ... </ObjectProfile>
    <ColorProfile> ... </ColorProfile>
    <TextureProfile> ... </TextureProfile>
    <ShapeProfile> ... </ShapeProfile>
    <MotionProfile> ... </MotionProfile>
</ProgramProfiles>

Program Identity

• Program ID

<ProgramID> program-id </ProgramID>

The descriptor <ProgramID> contains a number or a string
to identify a program.

• Program name

<ProgramName> program-name </ProgramName>

The descriptor <ProgramName> specifies the name of a
program.

• Source location

<SourceLocation> source-url </SourceLocation>

The descriptor <SourceLocation> specifies the location of
a program in URL format.
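
The three-section structure and the identity descriptors above can be sketched with Python's standard `xml.etree.ElementTree` module. The enclosing `<Program>` element is an assumption added here so that the three sections share a single root; the overview lists them as siblings without naming a wrapper. The function name and the sample values are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

def build_description(program_id, name, source_url):
    """Build the three top-level sections under a hypothetical <Program> root."""
    root = ET.Element("Program")  # wrapper element assumed for well-formedness
    identity = ET.SubElement(root, "ProgramIdentity")
    ET.SubElement(identity, "ProgramID").text = program_id
    ET.SubElement(identity, "ProgramName").text = name
    ET.SubElement(identity, "SourceLocation").text = source_url
    ET.SubElement(root, "ProgramViews")     # views would be filled in later
    ET.SubElement(root, "ProgramProfiles")  # profiles would be filled in later
    return root

doc = build_description("p-001", "Home video", "file:///videos/home.mpg")
sections = [child.tag for child in doc]
xml_text = ET.tostring(doc, encoding="unicode")
```

The analysis module or a user input interface could then append individual views and profiles to the two empty sections as they become available.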

Program Views

• Thumbnail view

<ThumbnailView>
    <Image> thumbnail-image </Image>
</ThumbnailView>

The descriptor <ThumbnailView> specifies an image as the
thumbnail representation of a program.

• Slide view

<SlideView> frame-id ... </SlideView>

The descriptor <SlideView> specifies a number of frames
in a program which may be viewed as snapshots or in a
slide show manner.

• Frame view

<FrameView> start-frame-id end-frame-id </FrameView>

The descriptor <FrameView> specifies the start and end
frames of a program. This is the most basic view of a
program and any program has a frame view.

• Shot view

<ShotView>
    <Shot id=""> start-frame-id end-frame-id display-frame-id </Shot>
    <Shot id=""> start-frame-id end-frame-id display-frame-id </Shot>
</ShotView>

The descriptor <ShotView> specifies a number of shots in
a program. The <Shot> descriptor defines the start and
end frames of a shot. It may also specify a frame to
represent the shot.
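
A browsing application would need to read the shot list back out of the description. The following Python sketch parses a `<ShotView>` into tuples; the integer frame identifiers, the sample values, and the choice to fall back to the start frame when no display frame is given are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SHOT_VIEW = """
<ShotView>
    <Shot id="1"> 0 120 60 </Shot>
    <Shot id="2"> 121 300 </Shot>
</ShotView>
"""

def parse_shots(xml_text):
    """Return (id, start, end, display) tuples; display defaults to start."""
    shots = []
    for shot in ET.fromstring(xml_text).findall("Shot"):
        frames = [int(token) for token in shot.text.split()]
        start, end = frames[0], frames[1]
        # The display frame is optional in this sketch.
        display = frames[2] if len(frames) > 2 else start
        shots.append((shot.get("id"), start, end, display))
    return shots

shots = parse_shots(SHOT_VIEW)
```

Each tuple gives the interface enough information to show one representative frame per shot and to start playback at the shot boundary.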



• Key-frame view

<KeyFrameView>
    <KeyFrames level="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </KeyFrames>
    <KeyFrames level="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </KeyFrames>
</KeyFrameView>

The descriptor <KeyFrameView> specifies key frames in a
program. The key frames may be organized in a
hierarchical manner and the hierarchy is captured by the
descriptor <KeyFrames> with a level attribute. The clips
which are associated with each key frame are defined by
the descriptor <Clip>. Here the display frame in each
clip is the corresponding key frame.
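
The hierarchical organization can be illustrated by selecting the key frames of one level, as a viewer would when adjusting the level of detail. In this Python sketch the numeric level values, the frame numbers, and the function name `key_frames_at` are hypothetical.

```python
import xml.etree.ElementTree as ET

KEY_FRAME_VIEW = """
<KeyFrameView>
    <KeyFrames level="1">
        <Clip id="a"> 0 500 250 </Clip>
    </KeyFrames>
    <KeyFrames level="2">
        <Clip id="b"> 0 250 100 </Clip>
        <Clip id="c"> 251 500 400 </Clip>
    </KeyFrames>
</KeyFrameView>
"""

def key_frames_at(xml_text, level):
    """Return the display (key) frame of every clip at the given level."""
    view = ET.fromstring(xml_text)
    for group in view.findall("KeyFrames"):
        if group.get("level") == str(level):
            # The third value of each clip is its display frame.
            return [int(clip.text.split()[2]) for clip in group.findall("Clip")]
    return []

coarse = key_frames_at(KEY_FRAME_VIEW, 1)
fine = key_frames_at(KEY_FRAME_VIEW, 2)
```

Since each clip also carries its start and end frames, playback of the segment around a selected key frame needs no further lookup.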



• Highlight view

<HighlightView>
    <Highlight length="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </Highlight>
    <Highlight length="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </Highlight>
</HighlightView>

The descriptor <HighlightView> specifies clips to form
highlights of a program. A program may have different
versions of highlights which are tailored to various
time lengths. The clips are grouped into each version of
highlight which is specified by the descriptor
<Highlight> with a length attribute.
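
A request such as a 5 minute highlight could be served by picking the stored version that best fits the requested length. The Python sketch below chooses the longest version that does not exceed the request; expressing the length attribute in seconds, the sample clip values, and the selection rule itself are assumptions, since the description does not fix the unit or the policy.

```python
import xml.etree.ElementTree as ET

HIGHLIGHT_VIEW = """
<HighlightView>
    <Highlight length="120">
        <Clip id="1"> 100 400 250 </Clip>
    </Highlight>
    <Highlight length="300">
        <Clip id="1"> 100 400 250 </Clip>
        <Clip id="2"> 900 1500 1200 </Clip>
    </Highlight>
    <Highlight length="600">
        <Clip id="1"> 100 400 250 </Clip>
        <Clip id="2"> 900 1500 1200 </Clip>
        <Clip id="3"> 2000 2900 2500 </Clip>
    </Highlight>
</HighlightView>
"""

def pick_highlight(xml_text, max_length):
    """Return (length, clip ranges) of the longest version within max_length."""
    versions = ET.fromstring(xml_text).findall("Highlight")
    fitting = [v for v in versions if int(v.get("length")) <= max_length]
    if not fitting:
        return None
    best = max(fitting, key=lambda v: int(v.get("length")))
    clips = [tuple(int(t) for t in c.text.split()[:2])
             for c in best.findall("Clip")]
    return int(best.get("length")), clips

choice = pick_highlight(HIGHLIGHT_VIEW, 300)  # a 5 minute request, in seconds
```

The returned frame ranges can be played back to back to present the highlight.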



• Event view

<EventView>
    <Events name="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </Events>
    <Events name="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </Events>
</EventView>

The descriptor <EventView> specifies clips which are
related to certain events in a program. The clips are
grouped into the corresponding events which are specified
by the descriptor <Events> with a name attribute.

• Close-up view

<CloseUpView>
    <Target name="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </Target>
    <Target name="">
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
        <Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
    </Target>
</CloseUpView>

The descriptor <CloseUpView> specifies clips which may be
zoomed in to certain targets in a program. The clips are
grouped into the corresponding targets which are
specified by the descriptor <Target> with a name
attribute.
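
The event view and the close-up view share one shape: named groups that each hold a list of clips. A single helper can therefore index either view by name, as in this Python sketch. The event names, frame numbers, and the helper name `clips_by_name` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

EVENT_VIEW = """
<EventView>
    <Events name="slam dunk">
        <Clip id="1"> 100 220 150 </Clip>
        <Clip id="2"> 900 1010 950 </Clip>
    </Events>
    <Events name="free throw">
        <Clip id="3"> 400 460 430 </Clip>
    </Events>
</EventView>
"""

def clips_by_name(xml_text, group_tag):
    """Map each group's name attribute to its list of (start, end) frames."""
    groups = {}
    for group in ET.fromstring(xml_text).findall(group_tag):
        ranges = [tuple(int(t) for t in clip.text.split()[:2])
                  for clip in group.findall("Clip")]
        groups[group.get("name")] = ranges
    return groups

events = clips_by_name(EVENT_VIEW, "Events")
```

Calling the same helper with `"Target"` on a `<CloseUpView>` would index the close-up clips by target name.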



• Alternate view

<AlternateView>
    <AlternateSource id=""> source-url </AlternateSource>
    <AlternateSource id=""> source-url </AlternateSource>
</AlternateView>

The descriptor <AlternateView> specifies sources which
may be shown as alternate views of a program. Each
alternate view is specified by the descriptor
<AlternateSource> with an id attribute. The location of the
source may be specified in URL format.

Program Profiles

• General profile

<GeneralProfile>
    <Title> title-text </Title>
    <Abstract> abstract-text </Abstract>
    <Audio> voice-annotation </Audio>
    <Www> web-page-url </Www>
    <ClosedCaption> yes/no </ClosedCaption>
    <Language> language-name </Language>
    <Rating> rating </Rating>
    <Length> time </Length>
    <Authors> author-name ... </Authors>
    <Producers> producer-name ... </Producers>
    <Directors> director-name ... </Directors>
    <Actors> actor-name ... </Actors>
</GeneralProfile>

The descriptor <GeneralProfile> describes the general
aspects of a program.



• Category profile

<CategoryProfile> category-name ... </CategoryProfile>

The descriptor <CategoryProfile> specifies the categories
under which a program may be classified.



• Date-time profile

<DateTimeProfile>
    <ProductionDate> date </ProductionDate>
    <ReleaseDate> date </ReleaseDate>
    <RecordingDate> date </RecordingDate>
    <RecordingTime> time </RecordingTime>
</DateTimeProfile>

The descriptor <DateTimeProfile> specifies various date
and time information of a program.



• Keyword profile

<KeywordProfile> keyword ... </KeywordProfile>

The descriptor <KeywordProfile> specifies a number of
keywords which may be used to filter or search a program.
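
Filtering on the keyword profile might look like the following Python sketch, which reports whether a program shares at least one keyword with a query. Treating the keywords as whitespace-separated and case-insensitive is an assumption, as are the `<ProgramProfiles>` sample and the function name `matches`.

```python
import xml.etree.ElementTree as ET

PROFILES = """
<ProgramProfiles>
    <CategoryProfile> news </CategoryProfile>
    <KeywordProfile> microsoft trial justice </KeywordProfile>
</ProgramProfiles>
"""

def matches(xml_text, wanted_keywords):
    """True when the program's keyword profile shares a keyword with the query."""
    profile = ET.fromstring(xml_text).find("KeywordProfile")
    if profile is None or not profile.text:
        return False
    keywords = set(profile.text.lower().split())
    return not keywords.isdisjoint(k.lower() for k in wanted_keywords)

hit = matches(PROFILES, ["Microsoft"])
miss = matches(PROFILES, ["soccer"])
```

The same pattern applies to the category profile by reading `<CategoryProfile>` instead.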

• Trigger profile

<TriggerProfile> trigger-frame-id ... </TriggerProfile>

The descriptor <TriggerProfile> specifies a number of
frames in a program which may be used to trigger certain
actions during playback of the program.



• Still profile

<StillProfile>
  <Still id="">
    <HotRegion id="">
      <Location> x1 y1 x2 y2 </Location>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
      <Www> web-page-url </Www>
    </HotRegion>
    <HotRegion id="">
      <Location> x1 y1 x2 y2 </Location>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
      <Www> web-page-url </Www>
    </HotRegion>
  </Still>
  <Still id="">
    <HotRegion id="">
      <Location> x1 y1 x2 y2 </Location>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
      <Www> web-page-url </Www>
    </HotRegion>
    <HotRegion id="">
      <Location> x1 y1 x2 y2 </Location>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
      <Www> web-page-url </Www>
    </HotRegion>
  </Still>
</StillProfile>

The descriptor <StillProfile> specifies hot regions or
regions of interest within a frame. The frame is
specified by the descriptor <Still> with an id attribute
which corresponds to the frame-id. Within a frame, each
hot region is specified by the descriptor <HotRegion>
with an id attribute.
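
As an illustration of how a player might consume these descriptors, the following Python sketch parses a small <StillProfile> fragment and returns the hot region, if any, that contains a given pixel. The sample ids, coordinates and URL, the helper name find_hot_region, and the reading of <Location> as four whitespace-separated integers giving the top-left and bottom-right corners are all assumptions made for the example; the description scheme itself does not fix them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; ids, coordinates and the URL are made up for illustration.
SAMPLE_STILL_PROFILE = """
<StillProfile>
  <Still id="120">
    <HotRegion id="r1">
      <Location> 10 10 50 40 </Location>
      <Text> scoreboard </Text>
      <Www> http://example.com/score </Www>
    </HotRegion>
    <HotRegion id="r2">
      <Location> 60 20 90 80 </Location>
      <Text> player </Text>
    </HotRegion>
  </Still>
</StillProfile>
"""


def find_hot_region(profile_xml, frame_id, x, y):
    """Return the id of the first hot region of the frame that contains (x, y), or None."""
    root = ET.fromstring(profile_xml)
    for still in root.findall("Still"):
        if still.get("id") != frame_id:
            continue
        for region in still.findall("HotRegion"):
            # Assumed layout: x1 y1 x2 y2 = top-left and bottom-right corners.
            x1, y1, x2, y2 = map(int, region.findtext("Location").split())
            if x1 <= x <= x2 and y1 <= y <= y2:
                return region.get("id")
    return None
```

A viewer could call find_hot_region with the current frame id and the pointer position, and then present the <Text>, <Audio> or <Www> annotation of the matching region.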




• Event profile

<EventProfile>
  <EventList> event-name ... </EventList>
  <Event name="">
    <Www> web-page-url </Www>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
  </Event>
  <Event name="">
    <Www> web-page-url </Www>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
  </Event>
</EventProfile>

The descriptor <EventProfile> specifies the detailed
information for certain events in a program. Each event
is specified by the descriptor <Event> with a name
attribute. Each occurrence of an event is specified by
the descriptor <Occurrence> with an id attribute which
may be matched with a clip id under <EventView>.
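
The relationship between events and their occurrences may be sketched in Python as follows. The sketch builds, for each event name, the list of occurrence ids and frame ranges, which is the information needed to match an occurrence against a clip id. The sample event names and ids, the function name occurrences_by_event, and the assumption that frame ids are integers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; event names, ids and frame numbers are illustrative only.
SAMPLE_EVENT_PROFILE = """
<EventProfile>
  <EventList> dunk free-throw </EventList>
  <Event name="dunk">
    <Occurrence id="c1">
      <Duration> 100 180 </Duration>
      <Text> first-half dunk </Text>
    </Occurrence>
    <Occurrence id="c2">
      <Duration> 900 960 </Duration>
    </Occurrence>
  </Event>
  <Event name="free-throw">
    <Occurrence id="c3">
      <Duration> 400 450 </Duration>
    </Occurrence>
  </Event>
</EventProfile>
"""


def occurrences_by_event(profile_xml):
    """Map each event name to a list of (occurrence id, start frame, end frame)."""
    root = ET.fromstring(profile_xml)
    index = {}
    for event in root.findall("Event"):
        spans = []
        for occurrence in event.findall("Occurrence"):
            # Assumed: <Duration> holds two integer frame ids, start then end.
            start, end = map(int, occurrence.findtext("Duration").split())
            spans.append((occurrence.get("id"), start, end))
        index[event.get("name")] = spans
    return index
```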

• Character profile

<CharacterProfile>
  <CharacterList> character-name ... </CharacterList>
  <Character name="">
    <ActorName> actor-name </ActorName>
    <Gender> male </Gender>
    <Age> age </Age>
    <Www> web-page-url </Www>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
  </Character>
  <Character name="">
    <ActorName> actor-name </ActorName>
    <Gender> male </Gender>
    <Age> age </Age>
    <Www> web-page-url </Www>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
  </Character>
</CharacterProfile>

The descriptor <CharacterProfile> specifies the detailed
information for certain characters in a program. Each
character is specified by the descriptor <Character> with
a name attribute. Each occurrence of a character is
specified by the descriptor <Occurrence> with an id
attribute which may be matched with a clip id under
<CloseUpView>.



• Object profile

<ObjectProfile>
  <ObjectList> object-name ... </ObjectList>
  <Object name="">
    <Www> web-page-url </Www>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
  </Object>
  <Object name="">
    <Www> web-page-url </Www>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    <Occurrence id="">
      <Duration> start-frame-id end-frame-id </Duration>
      <Location> frame: [x1 y1 x2 y2] ... </Location>
      <Motion> v_x v_y v_z v_α v_β v_γ </Motion>
      <Text> text-annotation </Text>
      <Audio> voice-annotation </Audio>
    </Occurrence>
    ...
  </Object>
</ObjectProfile>

The descriptor <ObjectProfile> specifies the detailed
information for certain objects in a program. Each object
is specified by the descriptor <Object> with a name
attribute. Each occurrence of an object is specified by
the descriptor <Occurrence> with an id attribute which
may be matched with a clip id under <CloseUpView>.



• Color profile

<ColorProfile>
</ColorProfile>

The descriptor <ColorProfile> specifies the detailed
color information of a program. All MPEG-7 color
descriptors may be placed under this descriptor.



• Texture profile

<TextureProfile>
</TextureProfile>

The descriptor <TextureProfile> specifies the detailed
texture information of a program. All MPEG-7 texture
descriptors may be placed under this descriptor.



• Shape profile

<ShapeProfile>
</ShapeProfile>

The descriptor <ShapeProfile> specifies the detailed
shape information of a program. All MPEG-7 shape
descriptors may be placed under this descriptor.

• Motion profile

<MotionProfile>
</MotionProfile>

The descriptor <MotionProfile> specifies the detailed
motion information of a program. All MPEG-7 motion
descriptors may be placed under this descriptor.

User Description Scheme

The proposed user description scheme includes three major
sections for describing a user. The first section
identifies the described user. The second section records
a number of settings which may be preferred by the user.
The third section records some statistics which may reflect
certain usage patterns of the user. Therefore, the overall
structure of the proposed description scheme is as follows:

<?xml version="1.0"?>
<!DOCTYPE MPEG-7 SYSTEM "mpeg-7.dtd">
<UserIdentity>
  <UserID> ... </UserID>
  <UserName> ... </UserName>
</UserIdentity>
<UserPreferences>
  <BrowsingPreferences> ... </BrowsingPreferences>
  <FilteringPreferences> ... </FilteringPreferences>
  <SearchPreferences> ... </SearchPreferences>
  <DevicePreferences> ... </DevicePreferences>
</UserPreferences>
<UserHistory>
  <BrowsingHistory> ... </BrowsingHistory>
  <FilteringHistory> ... </FilteringHistory>
  <SearchHistory> ... </SearchHistory>
  <DeviceHistory> ... </DeviceHistory>
</UserHistory>
<UserDemographics>
  <Age> ... </Age>
  <Gender> ... </Gender>
  <ZIP> ... </ZIP>
</UserDemographics>

User Identity
• User ID

<UserID> user-id </UserID>

The descriptor <UserID> contains a number or a string to
identify a user.

• User name

<UserName> user-name </UserName>

The descriptor <UserName> specifies the name of a user.

User Preferences
• Browsing preferences

<BrowsingPreferences>
  <Views>
    <ViewCategory id=""> view-id ... </ViewCategory>
    <ViewCategory id=""> view-id ... </ViewCategory>
  </Views>
  <FrameFrequency> frequency ... </FrameFrequency>
  <ShotFrequency> frequency ... </ShotFrequency>
  <KeyFrameLevel> level-id ... </KeyFrameLevel>
  <HighlightLength> length ... </HighlightLength>
</BrowsingPreferences>

The descriptor <BrowsingPreferences> specifies the
browsing preferences of a user. The user's preferred
views are specified by the descriptor <Views>. For each
category, the preferred views are specified by the
descriptor <ViewCategory> with an id attribute which
corresponds to the category id. The descriptor
<FrameFrequency> specifies at what interval the frames
should be displayed on a browsing slider under the frame
view. The descriptor <ShotFrequency> specifies at what
interval the shots should be displayed on a browsing
slider under the shot view. The descriptor
<KeyFrameLevel> specifies at what level the key frames
should be displayed on a browsing slider under the key
frame view. The descriptor <HighlightLength> specifies
which version of the highlight should be shown under the
highlight view.

• Filtering preferences

<FilteringPreferences>
  <Categories> category-name ... </Categories>
  <Channels> channel-number ... </Channels>
  <Ratings> rating-id ... </Ratings>
  <Shows> show-name ... </Shows>
  <Authors> author-name ... </Authors>
  <Producers> producer-name ... </Producers>
  <Directors> director-name ... </Directors>
  <Actors> actor-name ... </Actors>
  <Keywords> keyword ... </Keywords>
  <Titles> title-text ... </Titles>
</FilteringPreferences>

The descriptor <FilteringPreferences> specifies the
filtering related preferences of a user.

• Search preferences

<SearchPreferences>
  <Categories> category-name ... </Categories>
  <Channels> channel-number ... </Channels>
  <Ratings> rating-id ... </Ratings>
  <Shows> show-name ... </Shows>
  <Authors> author-name ... </Authors>
  <Producers> producer-name ... </Producers>
  <Directors> director-name ... </Directors>
  <Actors> actor-name ... </Actors>
  <Keywords> keyword ... </Keywords>
  <Titles> title-text ... </Titles>
</SearchPreferences>

The descriptor <SearchPreferences> specifies the search
related preferences of a user.

• Device preferences

<DevicePreferences>
  <Brightness> brightness-value </Brightness>
  <Contrast> contrast-value </Contrast>
  <Volume> volume-value </Volume>
</DevicePreferences>

The descriptor <DevicePreferences> specifies the device
preferences of a user.



Usage History
• Browsing history

<BrowsingHistory>
  <Views>
    <ViewCategory id=""> view-id ... </ViewCategory>
    <ViewCategory id=""> view-id ... </ViewCategory>
  </Views>
  <FrameFrequency> frequency ... </FrameFrequency>
  <ShotFrequency> frequency ... </ShotFrequency>
  <KeyFrameLevel> level-id ... </KeyFrameLevel>
  <HighlightLength> length ... </HighlightLength>
</BrowsingHistory>

The descriptor <BrowsingHistory> captures the history of
a user's browsing related activities.



• Filtering history

<FilteringHistory>
  <Categories> category-name ... </Categories>
  <Channels> channel-number ... </Channels>
  <Ratings> rating-id ... </Ratings>
  <Shows> show-name ... </Shows>
  <Authors> author-name ... </Authors>
  <Producers> producer-name ... </Producers>
  <Directors> director-name ... </Directors>
  <Actors> actor-name ... </Actors>
  <Keywords> keyword ... </Keywords>
  <Titles> title-text ... </Titles>
</FilteringHistory>

The descriptor <FilteringHistory> captures the history of
a user's filtering related activities.

• Search history

<SearchHistory>
  <Categories> category-name ... </Categories>
  <Channels> channel-number ... </Channels>
  <Ratings> rating-id ... </Ratings>
  <Shows> show-name ... </Shows>
  <Authors> author-name ... </Authors>
  <Producers> producer-name ... </Producers>
  <Directors> director-name ... </Directors>
  <Actors> actor-name ... </Actors>
  <Keywords> keyword ... </Keywords>
  <Titles> title-text ... </Titles>
</SearchHistory>

The descriptor <SearchHistory> captures the history of a
user's search related activities.

• Device history

<DeviceHistory>
  <Brightness> brightness-value ... </Brightness>
  <Contrast> contrast-value ... </Contrast>
  <Volume> volume-value ... </Volume>
</DeviceHistory>

The descriptor <DeviceHistory> captures the history of a
user's device related activities.

User Demographics



• Age

<Age> age </Age>

The descriptor <Age> specifies the age of a user.

• Gender

<Gender> ... </Gender>

The descriptor <Gender> specifies the gender of a user.

• ZIP code

<ZIP> ... </ZIP>

The descriptor <ZIP> specifies the ZIP code of the place
where a user lives.

System Description Scheme

The proposed system description scheme includes four major
sections for describing a system. The first section
identifies the described system. The second section keeps
a list of all known users. The third section keeps lists of
available programs. The fourth section describes the
capabilities of the system. Therefore, the overall
structure of the proposed description scheme is as follows:

<?xml version="1.0"?>
<!DOCTYPE MPEG-7 SYSTEM "mpeg-7.dtd">
<SystemIdentity>
  <SystemID> ... </SystemID>
  <SystemName> ... </SystemName>
  <SystemSerialNumber> ... </SystemSerialNumber>
</SystemIdentity>
<SystemUsers>
  <Users> ... </Users>
</SystemUsers>
<SystemPrograms>
  <Categories> ... </Categories>
  <Channels> ... </Channels>
  <Programs> ... </Programs>
</SystemPrograms>
<SystemCapabilities>
  <Views> ... </Views>
</SystemCapabilities>



System Identity
• System ID

<SystemID> system-id </SystemID>

The descriptor <SystemID> contains a number or a string
to identify a video system or device.

• System name

<SystemName> system-name </SystemName>

The descriptor <SystemName> specifies the name of a video
system or device.

• System serial number

<SystemSerialNumber> system-serial-number </SystemSerialNumber>

The descriptor <SystemSerialNumber> specifies the serial
number of a video system or device.

System Users

• Users

<Users>
  <User>
    <UserID> user-id </UserID>
    <UserName> user-name </UserName>
  </User>
  <User>
    <UserID> user-id </UserID>
    <UserName> user-name </UserName>
  </User>
</Users>

The descriptor <SystemUsers> lists a number of users who
have registered on a video system or device. Each user is
specified by the descriptor <User>. The descriptor
<UserID> specifies a number or a string which should
match with the number or string specified in <UserID> in
one of the user description schemes.
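
The matching requirement on <UserID> may be checked mechanically. The following Python sketch, given only as an illustration, lists the user ids registered on a system that have no corresponding user description scheme. The sample ids and names, the function name unresolved_users, and the representation of the known user description schemes as a set of ids are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; user ids and names are illustrative only.
SAMPLE_USERS = """
<Users>
  <User><UserID> u1 </UserID><UserName> Alice </UserName></User>
  <User><UserID> u2 </UserID><UserName> Bob </UserName></User>
</Users>
"""


def unresolved_users(users_xml, known_user_ids):
    """Return the listed user ids that do not appear in known_user_ids."""
    root = ET.fromstring(users_xml)
    listed = [user.findtext("UserID").strip() for user in root.findall("User")]
    return [uid for uid in listed if uid not in known_user_ids]
```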

Programs in the System

• Categories

<Categories>
  <Category>
    <CategoryID> category-id </CategoryID>
    <CategoryName> category-name </CategoryName>
    <SubCategories> sub-category-id ... </SubCategories>
  </Category>
  <Category>
    <CategoryID> category-id </CategoryID>
    <CategoryName> category-name </CategoryName>
    <SubCategories> sub-category-id ... </SubCategories>
  </Category>
</Categories>

The descriptor <Categories> lists a number of categories
which have been registered on a video system or device.
Each category is specified by the descriptor <Category>.
The major-sub relationship between categories is captured
by the descriptor <SubCategories>.
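
The major-sub relationship may be sketched in Python as a mapping from each category id to its sub-category ids, from which all categories below a given category can be enumerated. The sample category ids and names and the function names load_categories and descendants are hypothetical, and the sketch assumes that the relationship contains no cycles.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; category ids and names are illustrative only.
SAMPLE_CATEGORIES = """
<Categories>
  <Category>
    <CategoryID> 1 </CategoryID>
    <CategoryName> Sports </CategoryName>
    <SubCategories> 2 3 </SubCategories>
  </Category>
  <Category>
    <CategoryID> 2 </CategoryID>
    <CategoryName> Basketball </CategoryName>
    <SubCategories> </SubCategories>
  </Category>
  <Category>
    <CategoryID> 3 </CategoryID>
    <CategoryName> Soccer </CategoryName>
    <SubCategories> </SubCategories>
  </Category>
</Categories>
"""


def load_categories(xml_text):
    """Return (names, children): id -> name and id -> list of sub-category ids."""
    root = ET.fromstring(xml_text)
    names, children = {}, {}
    for category in root.findall("Category"):
        cid = category.findtext("CategoryID").strip()
        names[cid] = category.findtext("CategoryName").strip()
        children[cid] = (category.findtext("SubCategories") or "").split()
    return names, children


def descendants(children, cid):
    """All sub-category ids below cid, depth-first (assumes no cycles)."""
    found = []
    for sub in children.get(cid, []):
        found.append(sub)
        found.extend(descendants(children, sub))
    return found
```

The same approach would apply to <Channels> and <SubChannels>, which use the same major-sub arrangement.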

• Channels

<Channels>
  <Channel>
    <ChannelID> channel-id </ChannelID>
    <ChannelName> channel-name </ChannelName>
    <SubChannels> sub-channel-id ... </SubChannels>
  </Channel>
  <Channel>
    <ChannelID> channel-id </ChannelID>
    <ChannelName> channel-name </ChannelName>
    <SubChannels> sub-channel-id ... </SubChannels>
  </Channel>
</Channels>

The descriptor <Channels> lists a number of channels
which have been registered on a video system or device.
Each channel is specified by the descriptor <Channel>.
The major-sub relationship between channels is captured
by the descriptor <SubChannels>.

• Programs

<Programs>
  <CategoryPrograms>
    <CategoryID> category-id </CategoryID>
    <Programs> program-id ... </Programs>
  </CategoryPrograms>
  <CategoryPrograms>
    <CategoryID> category-id </CategoryID>
    <Programs> program-id ... </Programs>
  </CategoryPrograms>
  <ChannelPrograms>
    <ChannelID> channel-id </ChannelID>
    <Programs> program-id ... </Programs>
  </ChannelPrograms>
  <ChannelPrograms>
    <ChannelID> channel-id </ChannelID>
    <Programs> program-id ... </Programs>
  </ChannelPrograms>
</Programs>

The descriptor <Programs> lists programs which are
available on a video system or device. The programs are
grouped under corresponding categories or channels. Each
group of programs is specified by the descriptor
<CategoryPrograms> or <ChannelPrograms>. Each program id
contained in the descriptor <Programs> should match with
the number or string specified in <ProgramID> in one of
the program description schemes.
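
The grouping of programs may be illustrated with a short Python sketch that builds two lookup tables, one keyed by category id and one keyed by channel id. The sample ids and the function name index_programs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; category, channel and program ids are illustrative only.
SAMPLE_PROGRAMS = """
<Programs>
  <CategoryPrograms>
    <CategoryID> sports </CategoryID>
    <Programs> p1 p2 </Programs>
  </CategoryPrograms>
  <ChannelPrograms>
    <ChannelID> 4 </ChannelID>
    <Programs> p2 p3 </Programs>
  </ChannelPrograms>
</Programs>
"""


def index_programs(xml_text):
    """Return (by_category, by_channel): group id -> list of program ids."""
    root = ET.fromstring(xml_text)
    by_category, by_channel = {}, {}
    for group in root.findall("CategoryPrograms"):
        key = group.findtext("CategoryID").strip()
        by_category[key] = group.findtext("Programs").split()
    for group in root.findall("ChannelPrograms"):
        key = group.findtext("ChannelID").strip()
        by_channel[key] = group.findtext("Programs").split()
    return by_category, by_channel
```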

System Capabilities
• Views

<Views>
  <View>
    <ViewID> view-id </ViewID>
    <ViewName> view-name </ViewName>
  </View>
  <View>
    <ViewID> view-id </ViewID>
    <ViewName> view-name </ViewName>
  </View>
</Views>

The descriptor <Views> lists views which are supported
by a video system or device. Each view is specified by
the descriptor <View>. The descriptor <ViewName>
contains a string which should match with one of the
following views used in the program description
schemes: ThumbnailView, SlideView, FrameView, ShotView,
KeyFrameView, HighlightView, EventView, and
CloseUpView.
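
As a sketch of how the <ViewName> constraint might be enforced, the following Python fragment separates the views listed by a system into those whose names match one of the views named above and those which do not. The sample view ids, the unrecognized name PanoramaView, and the function name classify_views are made up for the example.

```python
import xml.etree.ElementTree as ET

# View names taken from the list in the text above.
KNOWN_VIEWS = {
    "ThumbnailView", "SlideView", "FrameView", "ShotView",
    "KeyFrameView", "HighlightView", "EventView", "CloseUpView",
}

# Hypothetical sample; "PanoramaView" is deliberately not a known view.
SAMPLE_VIEWS = """
<Views>
  <View><ViewID> 1 </ViewID><ViewName> ThumbnailView </ViewName></View>
  <View><ViewID> 2 </ViewID><ViewName> EventView </ViewName></View>
  <View><ViewID> 3 </ViewID><ViewName> PanoramaView </ViewName></View>
</Views>
"""


def classify_views(views_xml):
    """Split the listed view names into (recognized, unrecognized) lists."""
    root = ET.fromstring(views_xml)
    recognized, unrecognized = [], []
    for view in root.findall("View"):
        name = view.findtext("ViewName").strip()
        if name in KNOWN_VIEWS:
            recognized.append(name)
        else:
            unrecognized.append(name)
    return recognized, unrecognized
```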



The present inventors came to the realization
that the program description scheme may be further
modified to provide additional capabilities. Referring
to FIG. 13, the modified program description scheme 400
includes four separate types of information, namely, a
syntactic structure description scheme 402, a semantic
structure description scheme 404, a visualization
description scheme 406, and a meta information
description scheme 408. It is to be understood that in
any particular system one or more of the description
schemes may be included, as desired.

Referring to FIG. 14, the visualization
description scheme 406 enables fast and effective
browsing of video programs (and audio programs) by
allowing access to the necessary data, preferably in a
one-step process. The visualization description scheme
406 provides for several different presentations of the
video content (or audio), such as, for example, a
thumbnail view description scheme 410, a key frame view
description scheme 412, a highlight view description
scheme 414, an event view description scheme 416, a
close-up view description scheme 418, and an alternate
view description scheme 420. Other presentation
techniques and description schemes may be added, as
desired. The thumbnail view description scheme 410
preferably includes an image 422 or reference to an image
representative of the video content and a time reference
424 to the video. The key frame view description scheme
412 preferably includes a level indicator 426 and a time
reference 428. The level indicator 426 accommodates the
presentation of a different number of key frames for the
same video portion depending on the user's preference.
The highlight view description scheme 414 includes a
length indicator 430 and a time reference 432. The
length indicator 430 accommodates the presentation of a
different highlight duration of a video depending on the
user's preference. The event view description scheme 416
preferably includes an event indicator 434 for the
selection of the desired event and a time reference 436.
The close-up view description scheme 418 preferably
includes a target indicator 438 and a time reference 440.
The alternate view description scheme 420 preferably includes
a source indicator 442. To increase performance of the
system it is preferred to specify the data which is
needed to render such views in a centralized and
straightforward manner. By doing so, it is then feasible
to access the data in a simple one-step process without
complex parsing of the video.

Referring to FIG. 15, the meta information
description scheme 408 generally includes various
descriptors which carry general information about a video
(or audio) program such as the title, category, keywords,
etc. Additional descriptors, such as those previously
described, may be included, as desired.

Referring again to FIG. 13, the syntactic
structure description scheme 402 specifies the physical
structure of a video program (or audio), e.g., a table of
contents. The physical features may include, for
example, color, texture, motion, etc. The syntactic
structure description scheme 402 preferably includes
three modules, namely a segment description scheme 450, a
region description scheme 452, and a segment/region
relation graph description scheme 454. The segment
description scheme 450 may be used to define
relationships between different portions of the video
consisting of multiple frames of the video. A segment
description scheme 450 may contain another segment
description scheme 450 and/or shot description scheme to
form a segment tree. Such a segment tree may be used to
define a temporal structure of a video program. Multiple
segment trees may be created and thereby create multiple
tables of contents. For example, a video program may be
segmented into story units, scenes, and shots, from which
the segment description scheme 450 may contain such
information as a table of contents. The shot description
scheme may contain a number of key frame description
schemes, mosaic description scheme(s), camera motion
description scheme(s), etc. The key frame description
scheme may contain a still image description scheme which
may in turn contain color and texture descriptors. It
is noted that various low level descriptors may be
included in the still image description scheme under the
segment description scheme. Also, the visual descriptors
may be included in the region description scheme which is
not necessarily under a still image description scheme.
One example of a segment description scheme 450 is shown
in FIG. 16.
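
The segment tree described above may be sketched as a simple recursive data structure. The following Python fragment is only an illustration: the class name Segment, its fields, the function table_of_contents, and the use of integer frame indices for segment boundaries are assumptions and are not taken from FIG. 16.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """A node of a hypothetical segment tree (story unit, scene or shot)."""
    title: str
    start: int  # first frame of the segment (assumed integer frame index)
    end: int    # last frame of the segment
    children: List["Segment"] = field(default_factory=list)


def table_of_contents(segment, depth=0):
    """Flatten a segment tree into indented table-of-contents lines."""
    lines = ["  " * depth + f"{segment.title} [{segment.start}-{segment.end}]"]
    for child in segment.children:
        lines.extend(table_of_contents(child, depth + 1))
    return lines


# Illustrative program segmented into story units, scenes and shots.
program = Segment("Program", 0, 999, [
    Segment("Story unit 1", 0, 499, [
        Segment("Scene 1", 0, 199, [
            Segment("Shot 1", 0, 99),
            Segment("Shot 2", 100, 199),
        ]),
        Segment("Scene 2", 200, 499),
    ]),
    Segment("Story unit 2", 500, 999),
])
```

Building a second tree over the same frames with a different grouping would yield a second table of contents for the same program.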

Referring to FIG. 17, the region description
scheme 452 defines the interrelationships between groups
of pixels of the same and/or different frames of the
video. The region description scheme 452 may also
contain geometrical features, color, texture features,
motion features, etc.

Referring to FIG. 18, the segment/region
relation graph description scheme 454 defines the
interrelationships between a plurality of regions (or
region description schemes), a plurality of segments (or
segment description schemes), and/or a plurality of
regions (or description schemes) and segments (or
description schemes).

Referring again to FIG. 13, the semantic
structure description scheme 404 is used to specify
semantic features of a video program (or audio), e.g.,
semantic events. In a similar manner to the syntactic
structure description scheme, the semantic structure
description scheme 404 preferably includes three modules,
namely an event description scheme 480, an object
description scheme 482, and an event/object relation
graph description scheme 484. The event description
scheme 480 may be used to form relationships between
different events of the video normally consisting of
multiple frames of the video. An event description
scheme 480 may contain another event description scheme
480 to form a segment tree. Such an event segment tree
may be used to define a semantic index table for a video
program. Multiple event trees may be created, thereby
creating multiple index tables. For example, a video
program may include multiple events, such as a basketball
dunk, a fast break, and a free throw, and the event
description scheme may contain such information as an
index table. The event description scheme may also
contain references which link the event to the
corresponding segments and/or regions specified in the
syntactic structure description scheme. One example of an
event description scheme is shown in FIG. 19.

Referring to FIG. 20, the object description
scheme 482 defines the interrelationships between groups
of pixels of the same and/or different frames of the
video representative of objects. The object description
scheme 482 may contain another object description scheme
and thereby form an object tree. Such an object tree may
be used to define an object index table for a video
program. The object description scheme may also contain
references which link the object to the corresponding
segments and/or regions specified in the syntactic
structure description scheme.

Referring to FIG. 21, the event/object relation
graph description scheme 484 defines the
interrelationships between a plurality of events (or
event description schemes), a plurality of objects (or
object description schemes), and/or a plurality of events
(or description schemes) and objects (or description
schemes).

After further consideration, the present
inventors came to the realization that the particular design
of the user preference description scheme is important to
implement portability, while permitting adaptive
updating, of the user preference description scheme.
Moreover, the user preference description scheme should
be readily usable by the system while likewise being
suitable for modification based on the user's historical
usage patterns. It is possible to collectively track all
users of a particular device to build a database for the
historical viewing preferences of the users of the
device, and thereafter process the data dynamically to
determine which content the users would likely desire.
However, this implementation would require the storage of
a large amount of data and the associated dynamic
processing requirements to determine the user
preferences. It is to be understood that the user
preference description scheme may be used alone or in
combination with other description schemes.

Referring to FIG. 22, to achieve portability
and potentially decreased processing requirements the
user preference description scheme 20 should be divided
into at least two separate description schemes, namely, a
usage preference description scheme 500 and a usage
history description scheme 502. The usage preference
description scheme 500, described in detail later,
includes a description scheme of the user's audio and/or
video consumption preferences. The usage preference
description scheme 500 describes one or more of the
following, depending on the particular implementation,
(a) browsing preferences, (b) filtering preferences, (c)
searching preferences, and (d) device preferences of the
user. The types of preferences shown in the usage
preference description scheme 500 are generally
immediately usable by the system for selecting and
otherwise using the available audio and/or video content.
In other words, the usage preference description scheme
500 includes data describing audio and/or video
consumption of the user. The usage history description
scheme 502, described in detail later, includes a
description scheme of the user's historical audio and/or
video activity, such as browsing, device settings,
viewing, and selection. The usage history description
scheme 502 describes one or more of the following,
depending on the particular implementation, (a) browsing
history, (b) filtering history, (c) searching history, and
(d) device usage history. The types of data shown
in the usage history description scheme 502 are not
generally immediately usable by the system for selecting
and otherwise using the available audio and/or video
content. The data contained in the usage history
description scheme 502 may be considered generally
"unprocessed", at least in comparison to the data
contained in the usage preference description scheme 500,
because it generally contains the historical usage data
of the audio and/or video content of the viewer.

In general, capturing the user's usage history
facilitates "automatic" composition of user preferences by
a machine, as desired. When updating the usage preference
description scheme 500 it is desirable that the usage
history description scheme 502 be relatively symmetric to
the usage preference description scheme 500. The
symmetry permits more effective updating because less
interpretation between the two description schemes is
necessary in order to determine what data should be
included in the preferences. Numerous algorithms can
then be applied to the history information
to derive user preferences. For instance, statistics
can be computed from the history and utilized for this
purpose.
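
One simple statistic of this kind is the frequency of each value in the history. The following Python sketch, offered only as one possible approach, derives a preference list for each field by taking the most frequent history values. Because the history and the preferences use the same field names, the result can be written back field by field. The function name derive_preferences, the top_n parameter, and the sample history values are hypothetical.

```python
from collections import Counter


def derive_preferences(history, top_n=2):
    """Map each history field to its top_n most frequent values.

    history: dict of field name -> list of recorded values. The field
    names are assumed to mirror the preference fields (the symmetry
    noted above), so no renaming step is needed.
    """
    preferences = {}
    for field_name, values in history.items():
        counts = Counter(values)
        preferences[field_name] = [value for value, _ in counts.most_common(top_n)]
    return preferences


# Hypothetical filtering history for one user.
filtering_history = {
    "Categories": ["sports", "news", "sports", "drama", "sports", "news"],
    "Channels": ["4", "7", "4", "4"],
}
```

More elaborate schemes, for example weighting recent entries more heavily, would fit the same structure.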

After consideration of the usage preference
description 500 and the usage history description 502,
the present inventors came to the realization that in the
home environment many different users with different
viewing and usage preferences may use the same device.
For example, with a male adult preferring sports, a
female adult preferring afternoon talk shows, and a three
year old child preferring children's programming, the
total information contained in the usage preference
description 500 and the usage history description 502
will not be individually suitable for any particular
user. The resulting composite data and its usage by the
device are frustrating to the users because the device
will not properly select and present audio and/or video
content that is tailored to any particular user. To
alleviate this limitation, the user preference
description 20 may also include a user identification
(user identifier) description 504. The user
identification description 504 includes an identification
of the particular user that is using the device. By
incorporating a user identification description 504, more
than one user may use the device while maintaining a
different or a unique set of data within the usage
preference description 500 and the usage history
description 502. Accordingly, the user identification
description 504 associates the appropriate usage
preference description(s) 500 and usage history
description(s) 502 with the particular user identified by
the user identification description 504. With multiple
user identification descriptions 504, multiple entries
within a single user identification description 504
identifying different users, and/or including the user
identification description within the usage preference
description 500 and/or usage history description 502 to
provide the association therebetween, multiple users can
readily use the same device while maintaining their
individuality. Also, without the user identification
description in the preferences and/or history, the user
may more readily customize content anonymously. In
addition, the user's user identification description 504
may be used to identify multiple different sets of usage
preference descriptions 500 and usage history descriptions
502, from which the user may select for present
interaction with the device depending on usage
conditions. The use of multiple user identification
descriptions for the same user is useful when the user
uses multiple different types of devices, such as a
television, a home stereo, a business television, a hotel
television, and a vehicle audio player, and maintains
multiple different sets of preference descriptions.
Further, the identification may likewise be used to
identify groups of individuals, such as, for example, a
family. In addition, for devices that are used on a
temporary basis, such as those in hotel rooms or rental
cars, the user identification requirements may be
overridden by employing a temporary session user
identification assigned by such devices. In applications
where privacy concerns may be resolved or are otherwise
not a concern, the user identification description 504
may also contain demographic information of the user. In
this manner, as the usage history description 502
grows during use over time, this demographic data
and/or data regarding usage patterns may be made
available to other sources. The data may be used for any
purpose, such as, for example, providing targeted
advertising or programming on the device based on such
data.
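
By way of a non-limiting illustration, the following Python
sketch shows one possible way a user identifier might
associate separate preference and history data with each user
of a shared device. The class names, the dictionary and list
stand-ins for descriptions 500 and 502, and the identifier
strings are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Simplified stand-ins for the usage preference description (500)
    # and the usage history description (502).
    usage_preferences: dict = field(default_factory=dict)
    usage_history: list = field(default_factory=list)


class ProfileStore:
    """Maps a user identifier (standing in for 504) to that user's data."""

    def __init__(self):
        self._profiles = {}

    def profile_for(self, user_id):
        # Create an empty profile the first time an identifier is seen.
        return self._profiles.setdefault(user_id, UserProfile())


store = ProfileStore()
store.profile_for("adult_1").usage_preferences["category"] = "sports"
store.profile_for("child_1").usage_preferences["category"] = "children"
store.profile_for("adult_1").usage_history.append("Game 1")
```

A group such as a family, or a temporary session on a hotel
device, could be handled in the same way by using a group or
session identifier as the key.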

Referring to FIG. 23, periodically an agent 510
processes the usage history description(s) 502 for a
particular user to "automatically" determine the
particular user's preferences. In this manner, the
user's usage preference description 500 is updated to
reflect data stored in the usage history description 502.
This processing by the agent 510 is preferably performed
on a periodic basis so that during normal operation the
usage history description 502 does not need to be
processed, or otherwise queried, to determine the user's
current browsing, filtering, searching, and device
preferences. The usage preference description 500 is
relatively compact and suitable for storage on a portable
storage device, such as a smart card, for use by other
devices as previously described.

Frequently, the user may be traveling away from
home with his smart card containing his usage preference
description 500. During such traveling the user will
likely be browsing, filtering, searching, and setting
device preferences of audio and/or video content on
devices to which he has provided his usage preference
description 500. However, in some circumstances the
audio and/or video content browsed, filtered, and
searched, and the device preferences set by the user, may
not be typical of what he is normally interested in. In
addition, for a single device the user may desire more
than one profile depending on the season, such as football
season, basketball season, baseball season, fall, winter,
summer, and spring. Accordingly, it may not be appropriate
for the device to create a usage history description 502
and thereafter have the agent 510 "automatically" update
the user's usage preference description 500. Doing so
would in effect corrupt the user's usage preference
description 500. Accordingly, the device should include an
option that prevents the agent 510 from updating the usage
preference description 500. Alternatively, the usage
preference description 500 may include one or more fields
or data structures that indicate whether or not the user
desires the usage preference description 500 (or portions
thereof) to be updated.
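
By way of a non-limiting illustration, the following Python
sketch shows both alternatives: a device-level option that
disables the agent, and a field within the preferences that
protects selected portions from automatic updates. The
"locked_fields" key, the function name, and the example values
are assumptions made for this example only.

```python
def update_preferences(preferences, derived, agent_enabled=True):
    """Return a copy of preferences with derived values merged in.

    Nothing changes when the agent is disabled. Fields named in the
    hypothetical "locked_fields" entry are never overwritten.
    """
    if not agent_enabled:
        return dict(preferences)
    locked = preferences.get("locked_fields", set())
    updated = dict(preferences)
    for key, value in derived.items():
        if key == "locked_fields" or key in locked:
            continue  # the user does not want this portion updated
        updated[key] = value
    return updated


home_preferences = {
    "category": "sports",
    "volume": 5,
    "locked_fields": {"category"},
}
# Values an agent might derive from atypical viewing while traveling.
derived_while_traveling = {"category": "travel", "volume": 7}

merged = update_preferences(home_preferences, derived_while_traveling)
unchanged = update_preferences(
    home_preferences, derived_while_traveling, agent_enabled=False
)
```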

Referring to FIG. 24, the device may use the
program descriptions provided by any suitable source
describing the current and/or future audio and/or video
content available, from which a filtering agent 520
selects the appropriate content for the particular
user(s). The content is selected based upon the usage
preference description for a particular user
identification(s) to determine a list of preferred audio
and/or video programs.
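
By way of a non-limiting illustration, the following Python
sketch shows one simple form such a filtering step might
take. The dictionary layout of the program descriptions, the
"categories" and "keywords" preference fields, and the
matching rule (category match or keyword substring match) are
assumptions made for this example and do not describe the
filtering agent 520 itself.

```python
def filter_programs(program_descriptions, usage_preferences):
    """Return titles of programs matching a preferred category or keyword."""
    preferred_categories = set(usage_preferences.get("categories", []))
    keywords = [k.lower() for k in usage_preferences.get("keywords", [])]
    selected = []
    for program in program_descriptions:
        # Search the title and synopsis together, ignoring case.
        text = (program.get("title", "") + " " + program.get("synopsis", "")).lower()
        category_match = program.get("category") in preferred_categories
        keyword_match = any(k in text for k in keywords)
        if category_match or keyword_match:
            selected.append(program["title"])
    return selected


programs = [
    {"title": "NBA Finals", "category": "sports", "synopsis": "Game 7"},
    {"title": "Cooking Today", "category": "lifestyle",
     "synopsis": "Pasta with a guest chef"},
    {"title": "Late Movie", "category": "drama",
     "synopsis": "A courtroom story"},
]
preferences = {"categories": ["sports"], "keywords": ["chef"]}

preferred_list = filter_programs(programs, preferences)
print(preferred_list)
```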
As may be observed, with a relatively
compact usage preference description 500 the user's
preferences are readily movable to different devices,
such as a personal video recorder, a TiVo player, a
RePlay Networks player, a car audio player, or other
audio and/or video appliance. Yet, the usage preference
description 500 may be updated in accordance with the
user's browsing, filtering, searching, and device
preferences.

Referring to FIG. 25, the usage preference
description 500 preferably includes three different
categories of descriptions, depending on the particular
implementation. The preferred descriptions include (a)
browsing preferences description 530, (b) filtering and
search preferences description 532, and (c) device
preferences description 534. The browsing preferences
description 530 relates to the viewing preferences of
audio and/or video programs. The filtering and search
preferences description 532 relates to audio and/or video
program level preferences. The program level preferences
are not necessarily used at the same time as the
(browsing) viewing preferences. For example, preferred
programs can be determined as a result of filtering
program descriptions according to the user's filtering
preferences. A particular preferred program may
subsequently be viewed in accordance with the user's
browsing preferences. Accordingly, efficient
implementation may be achieved if the browsing preferences
description 530 is separate, at least logically, from the
filtering and search preferences description 532. The
device preferences description 534 relates to the
preferences for setting up the device in relation to the
type of content being presented, e.g., romance, drama,
action, violence, evening, morning, day, weekend, weekday,
and/or the available presentation devices. For example,
presentation devices may include stereo sound, mono
sound, surround sound, multiple potential displays,
multiple different sets of audio speakers, AC-3, and
Dolby Digital. It may be observed that the device
preferences description 534 is likewise separate, at
least logically, from the browsing preferences
description 530 and the filtering and search preferences
description 532.

The browsing preferences description 530
contains descriptors that describe preferences of the
user for browsing multimedia (audio and/or video)
information. In the case of video, for example, the
browsing preferences may include the user's preference for
continuous playback of the entire program versus
visualizing a short summary of the program. Various
summary types may be described in the program
descriptions describing multiple different views of
programs, where these descriptions are utilized by the
device to facilitate rapid non-linear browsing, viewing,
and navigation. Parameters of the various summary types
should also be specified, e.g., the number of hierarchy
levels when the keyframe summary is preferred, or the
time duration of the video highlight when the highlight
summary is preferred. In addition, browsing preferences
may also include descriptors describing parental control
settings. A switch descriptor (set by the user) should
also be included to specify whether or not the
preferences can be modified without consulting the user
first. This prevents inadvertent changing or updating of
the preferences by the device. In addition, it is
desirable that the browsing preferences are media content
dependent. For example, a user may prefer a 15 minute
video highlight of a basketball game or may prefer to see
only the 3-point shots. The same user may prefer a
keyframe summary with two levels of hierarchy for home
videos.

The filtering and search preferences
description 532 preferably has four descriptions defined
therein, depending on the particular embodiment. The
keyword preferences description 540 is used to specify
favorite topics that may not be captured in the title,
category, etc., information. This permits the acceptance
of a query for matching entries in any of the available
data fields. The content preferences description 542 is
used to facilitate capturing, for instance, favorite
actors and directors. The creation preferences description
544 is used to capture, for instance, titles of
favorite shows. The classification preferences
description 546 is used to specify, for instance, a
favorite program category. A switch
descriptor, activated by the user, may be included to
specify whether or not the preferences may be modified
without consulting the user, as previously described.

The device preferences description 534 contains
descriptors describing preferred audio and/or video
rendering settings, such as volume, balance, bass,
treble, brightness, contrast, closed captioning, AC-3,
Dolby Digital, which display device of several, type of
display device, etc. The settings of the device relate
to how the user browses and consumes the audio and/or
video content. It is desirable to be able to specify the
device setting preferences in a media type and
content-dependent manner. For example, the preferred
volume settings for an action movie may be higher than for
a drama, or the preferred settings of bass for classical
music and rock music may be different. A switch descriptor,
activated by the user, may be included to specify whether
or not the preferences may be modified without consulting
the user, as previously described.
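
By way of a non-limiting illustration, the following Python
sketch shows one way device settings might be looked up in a
media type and content-dependent manner, falling back to
defaults when no specific preference exists. The setting
names, numeric values, and the (media type, genre) key are
assumptions made for this example only.

```python
# Hypothetical fallback values used when no specific preference exists.
DEFAULT_SETTINGS = {"volume": 5, "bass": 5}


def settings_for(device_preferences, media_type, genre):
    """Return default settings overridden by any (media_type, genre) entry."""
    settings = dict(DEFAULT_SETTINGS)
    settings.update(device_preferences.get((media_type, genre), {}))
    return settings


device_preferences = {
    ("video", "action"): {"volume": 8},
    ("video", "drama"): {"volume": 5},
    ("audio", "classical"): {"bass": 3},
    ("audio", "rock"): {"bass": 7},
}

action_settings = settings_for(device_preferences, "video", "action")
rock_settings = settings_for(device_preferences, "audio", "rock")
```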

Referring to FIG. 26, the usage preferences
description may be used in cooperation with an MPEG-7
compliant data stream and/or device. MPEG-7 descriptions
are described in ISO/IEC JTC1/SC29/WG11 "MPEG-7 Media/Meta
DSs (V0.2)", August 1999, incorporated by reference
herein. It is preferable that media content descriptions
are consistent with descriptions of preferences of users
consuming the media. Consistency can be achieved by
using common descriptors in media and user preference
descriptions or by specifying a correspondence between
user preferences and media descriptors. Browsing
preferences descriptions are preferably consistent with
media descriptions describing different views and
summaries of the media. The content preferences
description 542 is preferably consistent with, e.g., a
subset of the content description of the media 553
specified in MPEG-7 by the content description scheme. The
classification preferences description 546 is preferably
consistent with, e.g., a subset of the classification
description 554 defined in MPEG-7 as the classification
description scheme. The creation preferences description
544 is preferably consistent with, e.g., a subset of the
creation description 556 specified in MPEG-7 by the
creation description scheme. The keyword preferences
description 540 is preferably a string supporting multiple
languages and consistent with corresponding media content
description schemes. Consistency between media and user
preference descriptions is depicted in FIG. 26
by double arrows in the case of content, creation, and
classification preferences.

Referring to FIG. 27, the usage history
description 502 preferably includes three different
categories of descriptions, depending on the particular
implementation. The preferred descriptions include (a)
browsing history description 560, (b) filtering and
search history description 562, and (c) device usage
history description 564, as previously described in
relation to the usage preference description 500. The
filtering and search history description 562 preferably
has four descriptions defined therein, depending on the
particular embodiment, namely, a keyword usage history
description 566, a content usage history description 568,
a creation preferences description 570, and a
classification usage history description 572, as
previously described with respect to the preferences.
The usage history description 502 may contain additional
descriptors therein (or description if desired) that
describe the time and/or time duration of information
contained therein. The time refers to the duration of
consuming a particular audio and/or video program. The
duration of time that a particular program has been
viewed provides information that may be used to determine
user preferences. For example, if a user only watches a
show for 5 minutes then it may not be a suitable
preference for inclusion in the usage preference
description 500. In addition, the present inventors came
to the realization that an even more accurate measure of
the user's preference of a particular audio and/or video
program is the time viewed in light of the total duration
of the program. This accounts for the relative viewing
duration of a program. For example, watching 30 minutes
of a 4 hour show may be of less relevance than watching
30 minutes of a 30 minute show to determine preference
data for inclusion in the usage preference description
500.
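
By way of a non-limiting illustration, the following Python
sketch works through the relative viewing duration for the
two cases above. The function names and the 0.5 threshold
are assumptions made for this example; the description does
not specify a particular threshold.

```python
def viewing_ratio(minutes_viewed, program_minutes):
    """Return the fraction of the program that was viewed, capped at 1.0."""
    if program_minutes <= 0:
        raise ValueError("program_minutes must be positive")
    return min(minutes_viewed / program_minutes, 1.0)


def is_preference_candidate(minutes_viewed, program_minutes, min_ratio=0.5):
    """Decide whether a viewing is substantial enough to inform preferences."""
    return viewing_ratio(minutes_viewed, program_minutes) >= min_ratio


# 30 minutes of a 4 hour (240 minute) show: 30 / 240 = 0.125
long_show = viewing_ratio(30, 240)
# 30 minutes of a 30 minute show: 30 / 30 = 1.0
short_show = viewing_ratio(30, 30)

print(long_show, is_preference_candidate(30, 240))
print(short_show, is_preference_candidate(30, 30))
```

Under these assumed values, the same 30 minutes of viewing
yields a ratio of 0.125 for the 4 hour show and 1.0 for the
30 minute show, so only the latter would be considered for
inclusion in the preferences.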

Referring to FIG. 28, an example of
an audio and/or video program receiver with persistent
storage is illustrated. As shown, audio/video program
descriptions are available from the broadcast or other
source, such as a telephone line. The user preference
description facilitates personalization of the browsing,
filtering and search, and device settings. In this
embodiment, the user preferences are stored at the user's
terminal with provision for transporting them to other
systems, for example via a smart card. Alternatively,
the user preferences may be stored in a server and the
content adaptation can be performed according to user
descriptions at the server, and then the preferred content
is transmitted to the user. The user may directly
provide the user preferences, if desired. The user
preferences and/or user history may likewise be provided
to a service provider. The system may employ an
application that records the user's usage history in the
form of a usage history description, as previously
defined. The usage history description is then utilized
by another application, e.g., a smart agent, to
automatically map usage history to user preferences.

The terms and expressions that have been
employed in the foregoing specification are used as terms
of description and not of limitation, and there is no
intention, in the use of such terms and expressions, of
excluding equivalents of the features shown and described
or portions thereof, it being recognized that the scope
of the invention is defined and limited only by the
claims that follow.



